
Environmental forecasting suites generate forecast products from a potentially large group of interdependent scientific models and associated data processing tasks. They are constrained by the availability of external driving data: typically one or more tasks will wait on real time observations and/or model data from an external system, and these will drive other downstream tasks, and so on. The dependency diagram for a single forecast cycle point in such a system is a Directed Acyclic Graph, as shown in Figure 1 (in our terminology, a forecast cycle point comprises all tasks with a common cycle point, which is the nominal analysis time or start time of the forecast models in the group). In real time operation, processing consists of a series of distinct forecast cycle points that are each initiated, after a gap, by the arrival of the new cycle point’s external driving data.
From a job scheduling perspective task execution order in such a system must be carefully controlled in order to avoid dependency violations. Ideally, each task should be queued for execution at the instant its last prerequisite is satisfied; this is the best that can be done even if queued tasks are not able to execute immediately because of resource contention.
Cylc was developed for the EcoConnect Forecasting System at NIWA (National Institute of Water and Atmospheric Research, New Zealand). EcoConnect takes real time atmospheric and stream flow observations, and operational global weather forecasts from the Met Office (UK), and uses these to drive global sea state and regional data assimilating weather models, which in turn drive regional sea state, storm surge, and catchment river models, plus tide prediction, and a large number of associated data collection, quality control, preprocessing, post-processing, product generation, and archiving tasks. The global sea state forecast runs once daily. The regional weather forecast runs four times daily but it supplies surface winds and pressure to several downstream models that run only twice daily, and precipitation accumulations to catchment river models that run on an hourly cycle assimilating real time stream flow observations and using the most recently available regional weather forecast. EcoConnect runs on heterogeneous distributed hardware, including a massively parallel supercomputer and several Linux servers.
Most dependence between tasks applies within a single forecast cycle point. Figure 1 shows the dependency diagram for a single forecast cycle point of a simple example suite of three forecast models (a, b, and c) and three post processing or product generation tasks (d, e and f). A scheduler capable of handling this must manage, within a single forecast cycle point, multiple parallel streams of execution that branch when one task generates output for several downstream tasks, and merge when one task takes input from several upstream tasks.


Figure 2 shows the optimal job schedule for two consecutive cycle points of the example suite in real time operation, given execution times represented by the horizontal extent of the task bars. There is a time gap between cycle points as the suite waits on new external driving data. Each task in the example suite happens to trigger off upstream tasks finishing, rather than off any intermediate output or event; this is merely a simplification that makes for clearer diagrams.


Now the question arises, what happens if the external driving data for upcoming cycle points is available in advance, as it would be after a significant delay in operations, or when running a historical case study? While the forecast model a appears to depend only on the external data x at this stage of the discussion, in fact it would typically also depend on its own previous instance for the model background state used in initializing the new forecast. Thus, as alluded to in Figure 3, task a could in principle start as soon as its predecessor has finished. Figure 4 shows, however, that starting the whole of the next cycle point at this time is dangerous - it results in dependency violations in half of the tasks in the example suite. In fact the situation could be even worse than this - imagine that task b in the first cycle point is delayed for some reason after the second cycle point has been launched. Clearly we must either handle inter-cycle dependence explicitly, or agree not to start the next cycle point early, as illustrated in Figure 5.
Forecast models typically depend on their own most recent previous forecast for background state or restart files of some kind (this is called warm cycling) but there can also be inter-cycle dependence between different tasks. In an atmospheric forecast analysis suite, for instance, the weather model may generate background states for observation processing and data-assimilation tasks in the next cycle point as well as for the next forecast model run. In real time operation inter-cycle dependence can be ignored because it is automatically satisfied when one cycle point finishes before the next begins. If it is not ignored it drastically complicates the dependency graph by blurring the clean boundary between cycle points. Figure 6 illustrates the problem for our simple example suite assuming minimal inter-cycle dependence: the warm cycled models (a, b, and c) each depend on their own previous instances.
For this reason, and because we tend to see forecasting suites in terms of their real time characteristics, other metaschedulers have ignored inter-cycle dependence and are thus restricted to running entire cycle points in sequence at all times. This does not affect normal real time operation, but it can be a serious impediment when advance availability of external driving data makes it possible, in principle, to run some tasks from upcoming cycle points before the current cycle point is finished - as was suggested at the end of the previous section. This can occur, for instance, after operational delays (late arrival of external data, system maintenance, etc.) and to an even greater extent in historical case studies and parallel test suites started behind a real time operation. It can be a serious problem for suites that have little downtime between forecast cycle points and therefore take many cycle points to catch up after a delay. Without taking account of inter-cycle dependence, the best that can be done, in general, is to reduce the gap between cycle points to zero as shown in Figure 5. A limited crude overlap of the single cycle point job schedule may be possible for specific task sets, but the allowable overlap may change if new tasks are added, and it is still dangerous: it amounts to running different parts of a dependent system as if they were not dependent, so it cannot be guaranteed that an unforeseen delay in one cycle point after the next has begun (e.g. due to resource contention or task failures) won’t result in dependency violations.


Figure 7 shows, in contrast to Figure 4, the optimal two cycle point job schedule obtained by respecting all inter-cycle dependence. This assumes no delays due to resource contention or otherwise - i.e. every task runs as soon as it is ready to run. The scheduler running this suite must be able to adapt dynamically to external conditions that impact on multi-cycle-point scheduling in the presence of inter-cycle dependence or else, again, risk bringing the system down with dependency violations.


To further illustrate the potential benefits of proper inter-cycle dependency handling, Figure 8 shows an operational delay of almost one whole cycle point in a suite with little downtime between cycle points. Above the time axis is the optimal schedule that is possible in principle when inter-cycle dependence is taken into account, and below it is the only safe schedule possible in general when it is ignored. In the former case, even the cycle point immediately after the delay is hardly affected, and subsequent cycle points are all on time, whilst in the latter case it takes five full cycle points to catch up to normal real time operation.
Similarly, Figure 9 shows example suite job schedules for a historical case study, or when catching up after a very long delay; i.e. when the external driving data are available many cycle points in advance. Task a, which as the most upstream forecast model is likely to be a resource intensive atmosphere or ocean model, has no upstream dependence on co-temporal tasks and can therefore run continuously, regardless of how much downstream processing is yet to be completed in its own, or any previous, forecast cycle point (actually, task a does depend on co-temporal task x which waits on the external driving data, but that returns immediately when the data is available in advance, so the result stands). The other forecast models can also cycle continuously or with a short gap between instances, and some post processing tasks, which have no previous-instance dependence, can run continuously or even overlap (e.g. e in this case). Thus, even for this very simple example suite, tasks from three or four different cycle points can in principle run simultaneously at any given time.
In fact, if our tasks are able to trigger off internal outputs of upstream tasks (message triggers) rather than waiting on full completion, then successive instances of the forecast models could overlap as well (because model restart outputs are generally completed early in the forecast) for an even more efficient job schedule.

Cylc manages a pool of proxy objects that represent the real tasks in a suite. Task proxies know how to run the real tasks that they represent, and they receive progress messages from the tasks as they run (usually reports of completed outputs). There is no global cycling mechanism to advance the suite; instead individual task proxies have their own private cycle point and spawn their own successors when the time is right. Task proxies are self-contained - they know their own prerequisites and outputs but are not aware of the wider suite. Inter-cycle dependence is not treated as special, and the task pool can be populated with tasks with many different cycle points. The task pool is illustrated in Figure 10. Whenever any task changes state due to completion of an output, every task checks to see if its own prerequisites have been satisfied. In effect, cylc gets a pool of tasks to self-organize by negotiating their own dependencies so that optimal scheduling, as described in the previous section, emerges naturally at run time.
Cylc runs on Linux. It is tested quite thoroughly on modern RHEL and Ubuntu distros. Some users have also managed to make it work on other Unix variants, including Apple OS X, but these are not officially tested and supported.
Python 2 (version 2.6 or later) is required, and version 2.7.9 or later is recommended for the best security. Python 2 should already be installed in your Linux system; see https://python.org/.
For Cylc’s HTTPS communications layer:
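    python-OpenSSL
    python-requests

(At the time of writing these were the relevant libraries; check the installation notes of your Cylc release for the exact names and minimum versions.)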
The following packages are highly recommended, but are technically optional as you can construct and run suites without dependency graph visualisation or the Cylc GUIs:
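    PyGTK                    # for the gcylc and gscan GUIs
    Graphviz and Pygraphviz  # for dependency graph visualisation

(Package names as commonly used for Cylc 7; your distribution may name them differently.)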
The User Guide is generated from LaTeX source files by running make in the top level Cylc directory. The specific LaTeX packages required may vary by distribution.
To generate the HTML User Guide ImageMagick is also needed.
In most modern Linux distributions all of the software above can be installed via the system package manager. Otherwise download packages manually and follow their native installation instructions. To check that all (non-LaTeX) packages are installed properly:
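    $ cylc check-software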
If errors are reported then the packages concerned are either not installed or not in your Python search path. (Note that cylc check-software has become quite trivial as we’ve removed or bundled some former dependencies, but in future we intend to make it print a comprehensive list of library versions etc. to include with bug reports.)
Cylc bundles several third party packages which do not need to be installed separately.
Cylc releases can be downloaded from https://cylc.github.io/cylc.
The wrapper script usr/bin/cylc should be installed to the system executable search path (e.g. /usr/local/bin/) and modified slightly to point to a location such as /opt where successive Cylc releases will be unpacked side by side.
To install Cylc, unpack the release tarball in the right location, e.g. /opt/cylc-7.7.0, type make inside the release directory, and set site defaults - if necessary - in a site global config file (below).
Make a symbolic link from cylc to the latest installed version: ln -s /opt/cylc-7.7.0 /opt/cylc. This will be invoked by the central wrapper if a specific version is not requested. Otherwise, the wrapper will attempt to invoke the Cylc version specified in $CYLC_VERSION, e.g. CYLC_VERSION=7.7.0. This variable is automatically set in task job scripts to ensure that jobs use the same Cylc version as their parent suite server program. It can also be set by users, manually or in login scripts, to fix the Cylc version in their environment.
Installing subsequent releases is just a matter of unpacking the new tarballs next to the previous releases, running make in them, and copying in (possibly with modifications) the previous site global config file.
It is easy to install Cylc under your own user account if you don’t have root or sudo access to the system: just put the central Cylc wrapper in $HOME/bin/ (making sure that is in your $PATH) and modify it to point to a directory such as $HOME/cylc/ where you will unpack and install release tarballs. Local installation of third party dependencies like Graphviz is also possible, but that depends on the particular installation methods used and is outside of the scope of this document.
Site and user global config files define some important parameters that affect all suites, some of which may need to be customized for your site. See 6 for how to generate an initial site file and where to install it. All legal site and user global config items are defined in B.
If your users submit task jobs to hosts other than the hosts they use to run their suites, you should ensure that the job hosts have the correct environment for running cylc. A cylc suite generates task job scripts that normally invoke bash -l, i.e. it will invoke bash as a login shell to run the job script. Users and sites should ensure that their bash login profiles are able to set up the correct environment for running cylc and their task jobs.
Your site administrator may customise the environment for all task jobs by adding a <cylc-dir>/etc/job-init-env.sh file and populating it with the appropriate contents. If further customisation is required, you can add your own ${HOME}/.cylc/job-init-env.sh file and populate it with the appropriate contents.
The job will attempt to source the first of these files it finds to set up its environment.
The cylc test battery is primarily intended for developers to check that changes to the source code don’t break existing functionality. Note that some test failures can be expected to result from suites timing out, even if nothing is wrong, if you run too many tests in parallel. See cylc test-battery --help.
A job is a program or script that runs on a computer, and a task is a workflow abstraction - a node in the suite dependency graph - that represents a job.
A cycle point is a particular date-time (or integer) point in a sequence of date-time (or integer) points. Each cylc task has a private cycle point and can advance independently to subsequent cycle points. It may sometimes be convenient, however, to refer to the “current cycle point” of a suite (or the previous or next one, etc.) with reference to a particular task, or in the sense of all task instances that “belong to” a particular cycle point. But keep in mind that different tasks may pass through the “current cycle point” (etc.) at different times as the suite evolves.
A model run and associated processing may need to be cycled for the following reasons:
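For example: in real time forecasting systems a new forecast must be initiated at regular intervals, whenever new observations and driving data arrive; and it may be convenient, or necessary, to split a single long model run and its associated processing into many smaller cycled chunks.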
Cylc provides two ways of constructing workflows for cycling systems: cycling workflows and parameterized tasks.
This is cylc’s classic cycling mode as described in the Introduction. Each instance of a cycling job is represented by a new instance of the same task, with a new cycle point. The suite configuration defines patterns for extending the workflow on the fly, so it can keep running indefinitely if necessary. For example, to cycle model.exe on a monthly sequence we could define a single task model, an initial cycle point, and a monthly sequence. Cylc then generates the date-time sequence and creates a new task instance for each cycle point as it comes up. Workflow dependencies are defined generically with respect to the “current cycle point” of the tasks involved.
This is the only sensible way to run very large suites or operational suites that need to continue cycling indefinitely. The cycling is configured with standards-based ISO 8601 date-time recurrence expressions. Multiple cycling sequences can be used at once in the same suite. See Section 9.3.
It is also possible to run cycling jobs with a pre-defined static workflow in which each instance of a cycling job is represented by a different task: as far as the abstract workflow is concerned there is no cycling. The sequence of tasks can be constructed efficiently, however, using cylc’s built-in suite parameters (9.6.7) or explicit Jinja2 loops (9.7).
For example, to run model.exe 12 times on a monthly cycle we could loop over an integer parameter R = 0, 1, 2, ..., 11 to define tasks model-R0, model-R1, model-R2, ..., model-R11, and the parameter values could be multiplied by the interval P1M (one month) to get the start point for the corresponding model run.
This method is only good for smaller workflows of finite duration because every single task has to be mapped out in advance, and cylc has to be aware of all of them throughout the entire run. Additionally Cylc’s cycling workflow capabilities (above) are more powerful, more flexible, and generally easier to use (Cylc will generate the cycle point date-times for you, for instance), so that is the recommended way to drive most cycling systems.
The primary use for parameterized tasks in cylc is to generate ensembles and other groups of related tasks at the same cycle point, not as a proxy for cycling.
For completeness we note that parameterized cycling can be used within a cycling workflow. For example, in a daily cycling workflow long (daily) model runs could be split into four shorter runs by parameterized cycling. A simpler six-hourly cycling workflow should be considered first, however.
Cylc site and user global configuration files contain settings that affect all suites. Some of these, such as the range of network ports used by cylc, should be set at site level. Legal items, values, and system defaults are documented in (B).
Others, such as the preferred text editor for suite definitions, can be overridden by users.
The file <cylc-dir>/etc/global.rc.eg contains instructions on how to generate and install site and user global config files.
This section provides a hands-on tutorial introduction to basic cylc functionality.
Some settings affecting cylc’s behaviour can be defined in site and user global config files. For example, to choose the text editor invoked by cylc on suite definitions:
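    # site/user global.rc (editor commands illustrative)
    [editors]
        terminal = vim
        gui = gvim -f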
See 3.3.3 for information.
You should have access to the cylc command line (CLI) and graphical (GUI) user interfaces once cylc has been installed as described in Section 3.3.
The command line interface is unified under a single top level cylc command that provides access to many sub-commands and their help documentation.
Command help transcripts are printed in F and are available from the GUI Help menu.
Cylc is scriptable - the error status returned by commands can be relied on.
The cylc GUI covers the same functionality as the CLI, but it has more sophisticated suite monitoring capability. It can start and stop suites, or connect to suites that are already running; in either case, shutting down the GUI does not affect the suite itself.
Clicking on a suite in gscan, shown in Figure 13, opens a gcylc instance for it.
Cylc suites are defined by extended-INI format suite.rc files (the main file format extension is section nesting). These reside in suite definition directories that may also contain a bin directory and any other suite-related files.
Suite registration creates a run directory (under ~/cylc-run/ by default) and populates it with authentication files and a symbolic link to a suite definition directory. Cylc commands that parse suite definitions can take the file path or the suite name as input. Commands that interact with running suites have to target the suite by name.
Registration (above) also generates a suite-specific passphrase file under .service/ in the suite run directory. It is loaded by the suite server program at start-up and used to authenticate connections from client programs.
Possession of a suite’s passphrase file gives full control over it. Without it, the information available to a client is determined by the suite’s public access privilege level.
For more on connection authentication, suite passphrases, and public access, see 12.9.
Run the following command to copy cylc’s example suites and register them for your own use:
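    $ cylc import-examples /tmp    # (target directory illustrative)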
Suites can be renamed by simply renaming (i.e. moving) their run directories. Make the tutorial suite names shorter, and print their locations with cylc print:
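    $ mv ~/cylc-run/examples/<cylc-version>/tutorial ~/cylc-run/tut
    $ cylc print tut    # (check cylc print --help for filtering options)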
See cylc print --help for other display options.
Suite definitions can be validated to detect syntax (and other) errors:
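    $ cylc validate tut/oneoff/basic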
Here’s the traditional Hello World program rendered as a cylc suite:
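    title = "The cylc Hello World! suite"
    [scheduling]
        [[dependencies]]
            graph = "hello"
    [runtime]
        [[hello]]
            script = "sleep 10; echo Hello World!"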
Cylc suites feature a clean separation of scheduling configuration, which determines when tasks are ready to run; and runtime configuration, which determines what to run (and where and how to run it) when a task is ready. In this example the [scheduling] section defines a single task called hello that triggers immediately when the suite starts up. When the task finishes the suite shuts down. That this is a dependency graph will be more obvious when more tasks are added. Under the [runtime] section the script item defines a simple inlined implementation for hello: it sleeps for ten seconds, then prints Hello World!, and exits. This ends up in a job script generated by cylc to encapsulate the task (below) and, thanks to some defaults designed to allow quick prototyping of new suites, it is submitted to run as a background job on the suite host. In fact cylc even provides a default task implementation that makes the entire [runtime] section technically optional:
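    [scheduling]
        [[dependencies]]
            graph = "hello"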
(the resulting dummy task just prints out some identifying information and exits).
The text editor invoked by cylc on suite definitions is determined by cylc site and user global config files, as shown above in 7.2. Check that you have renamed the tutorial examples suites as described just above and open the Hello World suite definition in your text editor:
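    $ cylc edit tut/oneoff/basic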
Alternatively, start gcylc on the suite:
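    $ cylc gui tut/oneoff/basic &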
and choose Suite → Edit from the menu.
The editor will be invoked from within the suite definition directory for easy access to other suite files (in this case there are none). There are syntax highlighting control files for several text editors under <cylc-dir>/etc/syntax/; see in-file comments for installation instructions.
If you’re quick enough (this example only takes 10-15 seconds to run) the cylc scan command will detect the running suite:
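    $ cylc run tut/oneoff/basic
    $ cylc scan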
Note you can use the --no-detach and --debug options to cylc-run to prevent the suite from daemonizing (i.e. to make it stay attached to your terminal until it exits).
When a task is ready cylc generates a job script to run it, by default as a background job on the suite host. The job process ID is captured, and job output is directed to log files in standard locations under the suite run directory.
Job log locations relative to the suite run directory look like log/job/1/hello/01/, where 1 is the cycle point of the task hello (for non-cycling tasks this is just ‘1’) and the final 01 is the submit number (so that job logs do not get overwritten if a job is resubmitted for any reason).
The suite shuts down automatically once all tasks have succeeded.
The cylc GUI can start and stop suites, or (re)connect to suites that are already running:
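    $ cylc gui tut/oneoff/basic &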
Use the tool bar Play button, or the Control → Run menu item, to run the suite again. You may want to alter the suite definition slightly to make the task take longer to run. Try right-clicking on the hello task to view its output logs. The relative merits of the three suite views - dot, text, and graph - will be more apparent later when we have more tasks. Closing the GUI does not affect the suite itself.
Suites that are currently running can be detected with command line or GUI tools:
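    $ cylc scan       # command line
    $ cylc gscan &    # GUI summary of your running suites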
The scan GUI is shown in Figure 13; clicking on a suite in it opens gcylc.
At run time, task instances are identified by name, which is determined entirely by the suite definition, and a cycle point which is usually a date-time or an integer:
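    hello.1              # task "hello" at cycle point 1 (non-cycling)
    model.20100808T00    # task "model" at cycle point 20100808T00 (name and date illustrative)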
Non-cycling tasks usually just have the cycle point 1, but this still has to be used to target the task instance with cylc commands.
Task job scripts are generated by cylc to wrap the task implementation specified in the suite definition (environment, script, etc.) in error trapping code, messaging calls to report task progress back to the suite server program, and so forth. Job scripts are written to the suite job log directory where they can be viewed alongside the job output logs. They can be accessed at run time by right-clicking on the task in the cylc GUI, or printed to the terminal:
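    $ cylc cat-log tut/oneoff/basic hello.1    # (see --help for selecting specific log files)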
This command can also print the suite log (and stdout and stderr for suites in daemon mode) and task stdout and stderr logs (see cylc cat-log --help). A new job script can also be generated on the fly for inspection:
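    $ cylc jobscript tut/oneoff/basic hello.1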
Take a look at the job script generated for hello.1 during the suite run above. The custom scripting should be clearly visible toward the bottom of the file.
The hello task in the first tutorial suite defaults to running as a background job on the suite host. To submit it to the Unix at scheduler instead, configure its job submission settings as in tut/oneoff/jobsub:
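    [runtime]
        [[hello]]
            script = "sleep 10; echo Hello World!"
            [[[job]]]
                batch system = at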
Run the suite again after checking that atd is running on your system.
Cylc supports a number of different batch systems. Tasks submitted to external batch queuing systems like at, PBS, SLURM, Moab, or LoadLeveler, are displayed as submitted in the cylc GUI until they start executing.
If the --no-detach option is not used, suite stdout and stderr will be directed to the suite run directory along with the time-stamped suite log file, and task job scripts and job logs (task stdout and stderr). The default suite run directory location is $HOME/cylc-run:
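A sketch of the layout (contents summarised from the text below):

    ~/cylc-run/tut/oneoff/basic/
        .service/    # suite service files (passphrase, contact information, run database)
        log/         # suite log files, and job scripts and job logs under log/job/
        share/       # common workspace shared by all tasks
        work/        # private task work directories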
The suite run database files, suite environment file, and task status files are used internally by cylc. Tasks execute in private work/ directories that are deleted automatically if empty when the task finishes. The suite share/ directory is made available to all tasks (by $CYLC_SUITE_SHARE_DIR) as a common share space. The task submission number increments from 1 if a task retries; this is used as a sub-directory of the log tree to avoid overwriting log files from earlier job submissions.
The top level run directory location can be changed in site and user config files if necessary, and the suite share and work locations can be configured separately because of the potentially larger disk space requirement.
Task job logs can be viewed by right-clicking on tasks in the gcylc GUI (so long as the task proxy is live in the suite), manually accessed from the log directory (of course), or printed to the terminal with the cylc cat-log command (see cylc cat-log --help).
The hello task in the first two tutorial suites defaults to running on the suite host. To make it run on a remote host instead change its runtime configuration as in tut/oneoff/remote:
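    [runtime]
        [[hello]]
            [[[remote]]]
                host = server1.example.com    # hostname illustrative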
In general, a task remote is a user account, other than the account running the suite server program, where a task job is submitted to run. It can be on the same machine running the suite or on another machine.
A task remote account must satisfy several requirements:
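Broadly: it must allow non-interactive ssh access from the account running the suite server program; cylc must be installed and runnable there (task jobs invoke bash login shells, as described above); and it must be able to communicate back to the suite server program.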
If your username is different on the task host, you can add a User setting for the relevant host in your ~/.ssh/config. If you are unable to do so, the [[[remote]]] section also supports an owner=username item.
If you configure a task account according to the requirements cylc will invoke itself on the remote account (with a login shell by default) to create log directories, transfer any essential service files, send the task job script over, and submit it to run there by the configured batch system.
Remote task job logs are saved to the suite run directory on the task remote, not on the account running the suite. They can be retrieved by right-clicking on the task in the GUI, or to have cylc pull them back to the suite account automatically do this:
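    [runtime]
        [[hello]]
            [[[remote]]]
                host = server1.example.com
                retrieve job logs = True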
This suite will attempt to rsync job logs from the remote host each time a task job completes.
Some batch systems have considerable delays between the time when the job completes and when it writes the job logs in its normal location. If this is the case, you can configure an initial delay and retry delays for job log retrieval by setting some delays. E.g.:
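    [[[remote]]]
        retrieve job logs = True
        retrieve job logs retry delays = PT10S, PT1M, PT5M    # delays illustrative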
Finally, if the disk space of the suite host is limited, you may want to set [[[remote]]]retrieve job logs max size=SIZE. The value of SIZE can be anything that is accepted by the --max-size=SIZE option of the rsync command. E.g.:
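    [[[remote]]]
        retrieve job logs max size = 10M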
It is worth noting that cylc uses the existence of a job’s job.out or job.err in the local file system to indicate a successful job log retrieval. If retrieve job logs max size=SIZE is set and both job.out and job.err are bigger than SIZE then cylc will consider the retrieval as failed. If retry delays are specified, this will trigger some useless (but harmless) retries. If this occurs regularly, consider increasing SIZE or reducing the amount of output that your jobs write to stdout and stderr.
To make a second task called goodbye trigger after hello finishes successfully, return to the original example, tut/oneoff/basic, and change the suite graph as in tut/oneoff/goodbye:
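    [scheduling]
        [[dependencies]]
            graph = "hello => goodbye"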
or to trigger it at the same time as hello,
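    graph = """hello
               goodbye"""    # both trigger at suite start-up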
and configure the new task’s behaviour under [runtime]:
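    [runtime]
        [[goodbye]]
            script = "sleep 10; echo Goodbye World!"    # message illustrative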
Run tut/oneoff/goodbye and check the output from the new task:
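    $ cylc run tut/oneoff/goodbye
    $ cat ~/cylc-run/tut/oneoff/goodbye/log/job/1/goodbye/01/job.out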
Task names in the graph string can be qualified with a state indicator to trigger off task states other than success:
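    a => b           # b triggers if a succeeds (the default)
    a:start => b     # b triggers when a starts executing
    a:fail => b      # b triggers if a fails
    a:finish => b    # b triggers when a succeeds or fails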
A common use of this is to automate recovery from known modes of failure:
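    graph = "goodbye:fail => really_goodbye"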
i.e. if task goodbye fails, trigger another task that (presumably) really says goodbye.
Failure triggering generally requires use of suicide triggers as well, to remove the recovery task if it isn’t required (otherwise it would hang about indefinitely in the waiting state):
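    graph = """goodbye:fail => really_goodbye
               goodbye => !really_goodbye"""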
This means if goodbye fails, trigger really_goodbye; and otherwise, if goodbye succeeds, remove really_goodbye from the suite.
Try running tut/oneoff/suicide, which also configures the hello task’s runtime to make it fail, to see how this works.
The [runtime] section is actually a multiple inheritance hierarchy. Each subsection is a namespace that represents a task, or if it is inherited by other namespaces, a family. This allows common configuration to be factored out of related tasks very efficiently.
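For example, the hello and goodbye tasks above can share a single implementation like this (a sketch; the greeting values are illustrative):

    [runtime]
        [[root]]
            script = "sleep 10; echo $GREETING"
        [[hello]]
            [[[environment]]]
                GREETING = Hello World!
        [[goodbye]]
            [[[environment]]]
                GREETING = Goodbye World!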
The [root] namespace provides defaults for all tasks in the suite. Here both tasks inherit script from root, which they customize with different values of the environment variable $GREETING. Note that inheritance from root is implicit; from other parents an explicit inherit = PARENT is required, as shown below.
Task families defined by runtime inheritance can also be used as shorthand in graph trigger expressions. To see this, consider two “greeter” tasks that trigger off another task foo:
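    graph = "foo => greeter_1 & greeter_2"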
If we put the common greeting functionality of greeter_1 and greeter_2 into a special GREETERS family, the graph can be expressed more efficiently like this:
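    graph = "foo => GREETERS"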
i.e. if foo succeeds, trigger all members of GREETERS at once. Here’s the full suite with runtime hierarchy shown:
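    [scheduling]
        [[dependencies]]
            graph = "foo => GREETERS"
    [runtime]
        [[GREETERS]]
            script = "sleep 10; echo $GREETING"    # script and greetings illustrative
        [[greeter_1]]
            inherit = GREETERS
            [[[environment]]]
                GREETING = Hello
        [[greeter_2]]
            inherit = GREETERS
            [[[environment]]]
                GREETING = Goodbye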
(Note that we recommend giving ALL-CAPS names to task families to help distinguish them from task names. However, this is just a convention.)
Experiment with the tut/oneoff/ftrigger1 suite to see how this works.
Tasks (or families) can also trigger off other families, but in this case we need to specify what the trigger means in terms of the upstream family members. Here’s how to trigger another task bar if all members of GREETERS succeed:
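    graph = """foo => GREETERS
               GREETERS:succeed-all => bar"""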
Verbose validation in this case reports the family trigger expanded in terms of its individual members.
Cylc ignores family member qualifiers like succeed-all on the right side of a trigger arrow, where they don’t make sense, to allow the two graph lines above to be combined in simple cases:
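    graph = "foo => GREETERS:succeed-all => bar"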
Any task triggering status qualified by -all or -any, for the members, can be used with a family trigger. For example, here’s how to trigger bar if all members of GREETERS finish (succeed or fail) and any of them succeeds:
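    graph = "GREETERS:finish-all & GREETERS:succeed-any => bar"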
(use of GREETERS:succeed-any by itself here would trigger bar as soon as any one member of GREETERS completed successfully). Verbose validation now begins to show how family triggers can simplify complex graphs, even for this tiny two-member family.
Experiment with tut/oneoff/ftrigger2 to see how this works.
You can style dependency graphs with an optional [visualization] section, as shown in tut/oneoff/ftrigger2:
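    [visualization]
        [[node attributes]]
            GREETERS = "style=filled", "fillcolor=#c1eec1"    # attribute values illustrative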
To display the graph in an interactive viewer:
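    $ cylc graph tut/oneoff/ftrigger2 &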
It should look like Figure 15 (with the GREETERS family node expanded on the right).
Graph styling can be applied to entire families at once, and custom “node groups” can also be defined for non-family groups.
The tasks in our examples so far have all had inlined implementation, in the suite definition, but real tasks often need to call external commands, scripts, or executables. To try this, let’s return to the basic Hello World suite and cut the implementation of the task hello out to a file hello.sh in the suite bin directory:
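    #!/bin/sh
    # bin/hello.sh (greeting illustrative)
    set -e
    sleep 10
    echo "Hello from $CYLC_TASK_ID!"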
Make the task script executable, and change the hello task runtime section to invoke it:
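    $ chmod +x bin/hello.sh

    [runtime]
        [[hello]]
            script = hello.sh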
If you run the suite now the new greeting from the external task script should appear in the hello task stdout log. This works because cylc automatically adds the suite bin directory to $PATH in the environment passed to tasks via their job scripts. To execute scripts (etc.) located elsewhere you can refer to the file by its full file path, or set $PATH appropriately yourself (this could be done via $HOME/.profile, which is sourced at the top of the task job script, or in the suite definition itself).
Note the use of set -e above to make the script abort on error. This allows the error trapping code in the task job script to automatically detect unforeseen errors.
So far we’ve considered non-cycling tasks, which finish without spawning a successor.
Cycling is based around iterating through date-time or integer sequences. For example, a sequence might be a set of date-times every 6 hours starting from a particular date-time. A cycling task may run for each item (cycle point) in such a sequence.
There may be multiple instances of this type of task running in parallel, if the opportunity arises and their dependencies allow it. Alternatively, a sequence can be defined with only one valid cycle point - in that case, a task belonging to that sequence may only run once.
Open the tut/cycling/one suite:
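    $ cylc edit tut/cycling/one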
The difference between cycling and non-cycling suites is all in the [scheduling] section, so we will leave the [runtime] section alone for now (this will result in cycling dummy tasks). Note that the graph is now defined under a new section heading that makes each task under it have a succession of cycle points ending in 00 or 12 hours, between specified initial and final cycle points (or indefinitely if no final cycle point is given), as shown in Figure 16.
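The [scheduling] section now looks something like this (dates illustrative):

    [scheduling]
        initial cycle point = 20130808T00
        final cycle point   = 20130812T00
        [[dependencies]]
            [[[T00,T12]]]
                graph = "foo => bar"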
If you run this suite instances of foo will spawn in parallel out to the runahead limit, and each bar will trigger off the corresponding instance of foo at the same cycle point. The runahead limit, which defaults to a few cycles but is configurable, prevents uncontrolled spawning of cycling tasks in suites that are not constrained by clock triggers in real time operation.
Experiment with tut/cycling/one to see how cycling tasks work.
The suite above is a very simple example of a cycling date-time workflow. More generally, cylc comprehensively supports the ISO 8601 standard for date-time instants, intervals, and sequences. Cycling graph sections can be specified using full ISO 8601 recurrence expressions, but these may be simplified by assuming context information from the suite - namely initial and final cycle points. One form of the recurrence syntax looks like Rn/start-date-time/period (Rn means run n times). In the example above, if the initial cycle point is always at 00 or 12 hours then [[[T00,T12]]] could be written as [[[PT12H]]], which is short for [[[R/initial-cycle-point/PT12H/]]] - i.e. run every 12 hours indefinitely starting at the initial cycle point. It is possible to add constraints to the suite to only allow initial cycle points at 00 or 12 hours, e.g.:
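    [scheduling]
        initial cycle point constraints = T00, T12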
The tut/cycling/two suite adds inter-cycle dependence to the previous example:
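    [[dependencies]]
        [[[T00,T12]]]
            graph = "foo[-PT12H] => foo => bar"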
For any given cycle point in the sequence defined by the cycling graph section heading, bar triggers off foo as before, but now foo triggers off its own previous instance foo[-PT12H]. Date-time offsets in inter-cycle triggers are expressed as ISO 8601 intervals (12 hours in this case). Figure 17 shows how this connects the cycling graph sections together.
Experiment with this suite to see how inter-cycle triggers work. Note that the first instance of foo, at suite start-up, will trigger immediately in spite of its inter-cycle trigger, because cylc ignores dependence on points earlier than the initial cycle point. However, the presence of an inter-cycle trigger usually implies something special has to happen at start-up. If a model depends on its own previous instance for restart files, for example, then some special process has to generate the initial set of restart files when there is no previous cycle point to do it. The following section shows one way to handle this in cylc suites.
Sometimes we want to be able to run a task at the initial cycle point, but refrain from running it in subsequent cycles. We can do this by writing an extra set of dependencies that are only valid at a single date-time cycle point. If we choose this to be the initial cycle point, these will only apply at the very start of the suite.
The cylc syntax for writing this single date-time cycle point occurrence is R1, which stands for R1/no-specified-date-time/no-specified-period. This is an adaptation of part of the ISO 8601 date-time standard’s recurrence syntax (Rn/date-time/period) with some special context information supplied by cylc for the no-specified-* data.
The 1 in the R1 means run once. As we’ve specified no date-time, Cylc will use the initial cycle point date-time by default, which is what we want. We’ve also missed out specifying the period - this is set by cylc to a zero amount of time in this case (as it never repeats, this is not significant).
For example, in tut/cycling/three:
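    [scheduling]
        initial cycle point = 20130808T00+13    # note the +13 time zone
        [[dependencies]]
            [[[R1]]]
                graph = "prep => foo"
            [[[T00,T12]]]
                graph = "foo[-PT12H] => foo => bar"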
This is shown in Figure 18.
Note that the time zone has been set to +1300 in this case, instead of UTC (Z) as before. If no time zone or UTC mode was set, the local time zone of your machine will be used in the cycle points.
At the initial cycle point, foo will depend on foo[-PT12H] and also on prep:
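    prep.20130808T0000+13 & foo.20130807T1200+13 => foo.20130808T0000+13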
Thereafter, it will just look like e.g.:
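    foo.20130808T0000+13 => foo.20130808T1200+13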
However, in our initial cycle point example, the dependence on foo.20130807T1200+13 will be ignored, because that task’s cycle point is earlier than the suite’s initial cycle point and so it cannot run. This means that the initial cycle point dependencies for foo actually look like:
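    prep.20130808T0000+13 => foo.20130808T0000+13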
Cylc can also do integer cycling for repeating workflows that are not date-time based.
Open the tut/cycling/integer suite, which is plotted in Figure 19.
The integer cycling notation is intended to look similar to the ISO 8601 date-time notation, but it is simpler for obvious reasons. The example suite illustrates two recurrence forms, Rn/start-point/period and Rn/period/stop-point, simplified somewhat using suite context information (namely the initial and final cycle points). The first form is used to run one special task called start at start-up, and for the main cycling body of the suite; and the second form to run another special task called stop in the final two cycles. The P character denotes period (interval) just like in the date-time notation. R/1/P2 would generate the sequence of points 1,3,5,....
Cylc has built in support for the Jinja2 template processor, which allows us to embed code in suite definitions to generate the final result seen by cylc.
The tut/oneoff/jinja2 suite illustrates two common uses of Jinja2: changing suite content or structure based on the value of a logical switch; and iteratively generating dependencies and runtime configuration for groups of related tasks:
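In sketch form (the bundled suite differs in detail; task names illustrative):

    #!jinja2
    [scheduling]
        [[dependencies]]
            graph = """
                hello
    {% if MULTI %}
        {% for N in range(3) %}
                hello => greeter_{{N}}
        {% endfor %}
    {% endif %}
            """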
To view the result of Jinja2 processing with the Jinja2 flag MULTI set to False:
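    $ cylc view --jinja2 --set MULTI=False tut/oneoff/jinja2    # (see cylc view --help for option details)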
And with MULTI set to True:
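    $ cylc view --jinja2 --set MULTI=True tut/oneoff/jinja2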
Tasks can be configured to retry a number of times if they fail. An environment variable $CYLC_TASK_TRY_NUMBER increments from 1 on each successive try, and is passed to the task to allow different behaviour on the retry:
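    [runtime]
        [[hello]]
            script = """
                sleep 10
                # fail on the first two tries (behaviour illustrative)
                if [ $CYLC_TASK_TRY_NUMBER -lt 3 ]; then
                    echo "Try $CYLC_TASK_TRY_NUMBER: failing"; false
                fi
                echo "Try $CYLC_TASK_TRY_NUMBER: succeeded"
            """
            [[[job]]]
                execution retry delays = 2*PT10S    # delays illustrative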
If a task with configured retries fails, it goes into the retrying state until the next retry delay is up, then it resubmits. It only enters the failed state on a final definitive failure.
If a task with configured retries is killed (by cylc kill or via the GUI) it goes to the held state so that the operator can decide whether to release it and continue the retry sequence or to abort the retry sequence by manually resetting it to the failed state.
Experiment with tut/oneoff/retry to see how this works.
If you have read access to another user’s account (even on another host) it is possible to use cylc monitor to look at their suite’s progress without full shell access to their account. To do this, you will need to copy their suite passphrase to
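    $HOME/.cylc/auth/OWNER@HOST/SUITE/passphrase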
(use of the host and owner names is optional here - see 12.9.2) and also retrieve the port number of the running suite from:
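    ~OWNER/cylc-run/SUITE/.service/contact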
Once you have this information, you can run
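    $ cylc monitor --user=OWNER --host=HOST SUITE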
to view the progress of their suite.
Other suite-connecting commands work in the same way; see 12.11.
Almost every feature of cylc can be tested quickly and easily with a simple dummy suite. You can write your own, or start from one of the example suites in /path/to/cylc/examples (see use of cylc import-examples above) - they all run “out of the box” and can be copied and modified at will.
Cylc commands target suites via their names, which are relative path names under the suite run directory (~/cylc-run/ by default). Suites can be grouped together under sub-directories. E.g.:
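    ~/cylc-run/tut/oneoff/basic     # suite name: tut/oneoff/basic
    ~/cylc-run/nwp/oper/region1     # suite name: nwp/oper/region1 (illustrative)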
Suites can be pre-registered with a name using the cylc register command. This creates the essential directory structure for the suite, and generates some service files underneath it. Otherwise, cylc run will create these files on suite start up.
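For example:

    $ cylc register my/suite /path/to/my/suite/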
Cylc suites are defined in structured, validated, suite.rc files that concisely specify the properties of, and the relationships between, the various tasks managed by the suite. This section of the User Guide deals with the format and content of the suite.rc file, including task definition. Task implementation - what’s required of the real commands, scripts, or programs that do the processing that the tasks represent - is covered in 10; and task job submission - how tasks are submitted to run - is in 11.
A cylc suite definition directory contains:
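    - the suite.rc suite definition file;
    - optionally, a bin/ directory of task scripts;
    - any other suite-related files.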
A typical example:
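    /path/to/my/suite/    # (layout and file names illustrative)
        suite.rc
        bin/
            hello.sh
            goodbye.sh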
Suite.rc files are an extended-INI format with section nesting.
Embedded template processor expressions may also be used in the file, to programmatically generate the final suite definition seen by cylc. Currently the Jinja2 template processor is supported (http://jinja.pocoo.org/docs); see 9.7 for examples. In the future cylc may provide a plug-in interface to allow use of other template engines too.
The following defines legal suite.rc syntax:
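In outline (informally; see A for the precise rules):

    # comments follow a '#' character
    item = value
    [section]
        item = value
        [[subsection]]
            [[[sub-subsection]]]
                item = value
    # long lines may be continued with a trailing backslash '\'
    # other files may be pulled in with '%include FILENAME'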
Suites that embed Jinja2 code (see 9.7) must process to raw suite.rc syntax.
Cylc has native support for suite.rc include-files, which may help to organize large suites. Inclusion boundaries are completely arbitrary - you can think of include-files as chunks of the suite.rc file simply cut-and-pasted into another file. Include-files may be included multiple times in the same file, and even nested. Include-file paths can be specified portably relative to the suite definition directory, e.g.:
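    %include inc/site-config.rc    # path (illustrative) relative to the suite definition directory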
Editing Temporarily Inlined Suites Cylc’s native file inclusion mechanism supports optional inlined editing:
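    $ cylc edit --inline tut/oneoff/basic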
The suite will be split back into its constituent include-files when you exit the edit session. While editing, the inlined file becomes the official suite definition so that changes take effect whenever you save the file. See cylc prep edit --help for more information.
Include-Files via Jinja2 Jinja2 (9.7) also has template inclusion functionality.
Cylc comes with syntax files for a number of text editors, under <cylc-dir>/etc/syntax/.
Refer to comments at the top of each file to see how to use them.
Cylc suite.rc files consist of a suite title and description followed by configuration items grouped under several top level section headings:
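    [cylc]             # suite-level settings
    [scheduling]       # determines when tasks are ready to run
    [runtime]          # determines what to run, and where and how to run it
    [visualization]    # dependency graph styling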
Cylc suite.rc files are automatically validated against a specification that defines all legal entries, values, options, and defaults. This detects formatting errors, typographic errors, illegal items and illegal values prior to run time. Some values are complex strings that require further parsing by cylc to determine their correctness (this is also done during validation). All legal entries are documented in the Suite.rc Reference (A).
The validator reports the line numbers of detected errors. Here’s an example showing a section heading with a missing right bracket:
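    [scheduling]
        [[dependencies]    # <-- missing right bracket
            graph = "foo => bar"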
If the suite.rc file uses include-files cylc view will show an inlined copy of the suite with correct line numbers (you can also edit suites in a temporarily inlined state with cylc edit --inline).
Validation does not check the validity of chosen batch systems.
The [scheduling] section of a suite.rc file defines the relationships between tasks in a suite - the information that allows cylc to determine when tasks are ready to run. The most important component of this is the suite dependency graph. Cylc graph notation makes clear textual graph representations that are very concise because sections of the graph that repeat at different hours of the day, say, only have to be defined once. Here’s an example with dependencies that vary depending on the particular cycle point:
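    [scheduling]
        initial cycle point = 20130808T00
        [[dependencies]]
            [[[T00]]]
                graph = "A => B & C"    # hours and task names illustrative
            [[[T12]]]
                graph = "A => B"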
Figure 20 shows the complete suite.rc listing alongside the suite graph. This is a complete, valid, runnable suite (it will use default task runtime properties such as script).

Multiline graph strings may contain:
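    - blank lines;
    - arbitrary internal white space;
    - internal comments (following the '#' character).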
Suite dependency graphs can be broken down into pairs in which the left side (which may be a single task or family, or several that are conditionally related) defines a trigger for the task or family on the right. For instance the “word graph” C triggers off B which triggers off A can be deconstructed into pairs C triggers off B and B triggers off A. In this section we use only the default trigger type, which is to trigger off the upstream task succeeding; see 9.3.5 for other available triggers.
In the case of cycling tasks, the triggers defined by a graph string are valid for cycle points matching the list of hours specified for the graph section. For example this graph:
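    [[[T00,T12]]]
        graph = "A => B"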
implies that B triggers off A for cycle points in which the hour matches 00 or 12.
To define inter-cycle dependencies, attach an offset indicator to the left side of a pair:
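    [[[T00,T12]]]
        graph = "A[-PT12H] => B"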
This means B[time] triggers off A[time-PT12H] (12 hours before) for cycle points with hours matching 00 or 12. time is implicit because this keeps graphs clean and concise, given that the majority of tasks will typically depend only on others with the same cycle point. Cycle point offsets can only appear on the left of a pair, because a pair defines a trigger for the right-hand task at cycle point time. However, A => B[-PT6H], which is illegal, can be reformulated as a future trigger A[+PT6H] => B (see 9.3.5.11). It is also possible to combine multiple offsets within a cycle point offset, e.g.:
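    graph = "A[-P1D-PT12H] => B"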
This means that B[time] triggers off A[time-P1D-PT12H] (1 day and 12 hours before).
Triggers can be chained together. This graph:
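    graph = "A => B => C"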
is equivalent to this:
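    graph = """A => B
               B => C"""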
Each trigger in the graph must be unique but the same task can appear in multiple pairs or chains. Separately defined triggers for the same task have an AND relationship. So this:
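    graph = """A => X
               B => X"""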
is equivalent to this:
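    graph = "A & B => X"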
In summary, the branching tree structure of a dependency graph can be partitioned into lines (in the suite.rc graph string) of pairs or chains, in any way you like, with liberal use of internal white space and comments to make the graph structure as clear as possible.
Splitting Up Long Graph Lines It is not necessary to use the general line continuation marker \ to split long graph lines. Just break at dependency arrows, or split long chains into smaller ones. This graph:
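    graph = "A => B => C => D"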
is equivalent to this:
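    graph = """A => B => C
               C => D"""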
and also to this:
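    graph = """A => B
               B => C => D"""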
A suite definition can contain multiple graph strings that are combined to generate the final graph.
One-off (Non-Cycling) Figure 21 shows a small suite of one-off non-cycling tasks; these all share a single cycle point (1) and don’t spawn successors (once they’re all finished the suite just exits). The integer 1 attached to each graph node is just an arbitrary label here.
Cycling Graphs For cycling tasks the graph section heading defines a sequence of cycle points for which the subsequent graph section is valid. Figure 22 shows a small suite of cycling tasks.
Graph section headings define recurrence expressions, the graph within a graph section heading defines a workflow at each point of the recurrence. For example in the following scenario:
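    [[[T06]]]
        graph = "foo => bar"    # task names illustrative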
T06 means “Run every day starting at 06:00 after the initial cycle point”. Cylc allows you to start (or end) at any particular time, repeat at whatever frequency you like, and even optionally limit the number of repetitions.
Graph section headings can also be used with integer cycling; see 9.3.4.8.
Syntax Rules Date-time cycling information is made up of a starting date-time, an interval, and an optional limit.
The time is assumed to be in the local time zone unless you set [cylc]cycle point time zone or [cylc]UTC mode. The calendar is assumed to be the proleptic Gregorian calendar unless you set [scheduling]cycling mode.
The syntax for representations is based on the ISO 8601 date-time standard, including its representation of date-times and intervals. What we define for cylc’s cycling syntax is our own optionally-heavily-condensed form of ISO 8601 recurrence syntax. The most common full form is R[limit?]/[date-time]/[interval]. However, we allow omitting information that can be guessed from the context (rules below). This means that it can be written as:
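    R[limit?]/[date-time]
    R[limit?]/[date-time]/[interval]
    [date-time]/[interval]
    [date-time]
    [interval]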
with example graph headings for each form being:
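    [[[R1/20130808T00]]]    # R[limit?]/[date-time]
    [[[R5/T00/PT12H]]]      # R[limit?]/[date-time]/[interval]
    [[[T00/PT12H]]]         # [date-time]/[interval]
    [[[T00]]]               # [date-time]
    [[[PT6H]]]              # [interval]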
Note that T00 is an example of [date-time], with an inferred 1 day period and no limit.
Where some or all date-time information is omitted, it is inferred to be relative to the initial date-time cycle point. For example, T00 by itself would mean the next occurrence of midnight that follows, or is, the initial cycle point. Entering +PT6H would mean 6 hours after the initial cycle point. Entering -P1D would mean 1 day before the initial cycle point. Entering no information for the date-time implies the initial cycle point date-time itself.
Where the interval is omitted and some (but not all) date-time information is omitted, it is inferred to be a single unit above the largest given specific date-time unit. For example, the largest given specific unit in T00 is hours, so the inferred interval is 1 day (daily), P1D.
Where the limit is omitted, unlimited cycling is assumed. This will be bounded by the final cycle point’s date-time if given.
Another supported form of ISO 8601 recurrence is: R[limit?]/[interval]/[date-time]. This form uses the date-time as the end of the cycling sequence rather than the start. For example, R3/P5D/20140430T06 means:
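run 3 times, every 5 days, ending at 20140430T06 - i.e. at 20140420T06, 20140425T06 and 20140430T06.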
This kind of form can be used for specifying special behaviour near the end of the suite, at the final cycle point’s date-time. We can also represent this in cylc with a collapsed form:
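    R[limit?]/[interval]    # the end date-time defaults to the final cycle point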
So, for example, you can write:
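    [[[R3/P5D]]]    # run 3 times, every 5 days, ending at the final cycle point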
Referencing The Initial And Final Cycle Points For convenience the caret and dollar symbols may be used as shorthand for the initial and final cycle points. Using this shorthand you can write:
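    [[[R1/^]]]       # run once at the initial cycle point
    [[[R1/$]]]       # run once at the final cycle point
    [[[^/PT12H]]]    # every 12 hours from the initial cycle point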
Note that there can be multiple ways to write the same headings, for instance the following all run once at the final cycle point:
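    [[[R1/$]]]
    [[[R1/P0Y/$]]]
    [[[R1/P0Y]]]    # interval-only forms end at the final cycle point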
Excluding Dates Date-times can be excluded from a recurrence with an exclamation mark: for example, [[[PT1D!20000101]]] means run daily except on the first of January 2000.
This syntax can be used to exclude one or multiple date-times from a recurrence. Multiple date-times are excluded using the syntax [[[PT1D!(20000101,20000102,...) ]]]. All date-times listed within the parentheses after the exclamation mark will be excluded. Note that the ^ and $ symbols (shorthand for the initial and final cycle points) are both date-times so [[[T12!$-PT1D ]]] is valid.
If using a run limit in combination with an exclusion, the heading might not run the number of times specified in the limit. For example in the following suite foo will only run once as its second run has been excluded.
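A sketch of the idea (recurrence form illustrative):

    [scheduling]
        initial cycle point = 20000101
        [[dependencies]]
            [[[R2/^/P1D!20000102]]]    # would run at 20000101 and 20000102, but the second point is excluded
                graph = "foo"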
Advanced exclusion syntax In addition to excluding isolated date-time points or lists of date-time points from recurrences, exclusions themselves may be date-time recurrence sequences. Any partial date-time or sequence given after the exclamation mark will be excluded from the main sequence.
For example, partial date-times can be excluded using the syntax:
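    [[[PT1H!T12]]]    # run hourly, except at 12:00 each day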
It is also valid to use sequences for exclusions. For example:
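    [[[PT1H!PT6H]]]    # run hourly, except at the points of the 6-hourly sequence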
You can combine exclusion sequences and single point exclusions within a comma separated list enclosed in parentheses:
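    [[[PT1H!(20000101T07, PT2H)]]]    # hourly, except at 07:00 on 1 January 2000 and on the 2-hourly sequence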
How Multiple Graph Strings Combine For a cycling graph with multiple validity sections for different hours of the day, the different sections add to generate the complete graph. Different graph sections can overlap (i.e. the same hours may appear in multiple section headings) and the same tasks may appear in multiple sections, but individual dependencies should be unique across the entire graph. For example, the following graph defines a duplicate prerequisite for task C:
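    [[[T00,T06,T12,T18]]]
        graph = "A => B => C"
    [[[T06,T18]]]
        graph = "B => C => D"    # duplicates the "B => C" dependence at T06 and T18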
This does not affect scheduling, but for the sake of clarity and brevity the graph should be written like this:
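    [[[T00,T06,T12,T18]]]
        graph = "A => B => C"
    [[[T06,T18]]]
        graph = "C => D"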
Advanced Examples The following examples show the various ways of writing graph headings in cylc.
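    [[[R1]]]            # run once at the initial cycle point
    [[[P1D]]]           # daily, from the initial cycle point
    [[[T00,T12]]]       # twice daily, at 00:00 and 12:00
    [[[+PT6H/PT6H]]]    # 6-hourly, starting 6 hours after the initial cycle point
    [[[R1/$]]]          # run once at the final cycle point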
Advanced Starting Up Dependencies that are only valid at the initial cycle point can be written using the R1 notation (e.g. as in 7.23.3). For example:
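    [scheduling]
        initial cycle point = 20130808T00
        [[dependencies]]
            [[[R1]]]
                graph = "prep => foo"
            [[[T00]]]
                graph = "foo[-P1D] => foo"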
In the example above, R1 implies R1/20130808T00, so prep only runs once at that cycle point (the initial cycle point). At that cycle point, foo will have a dependence on prep - but not at subsequent cycle points.
However, it is possible to have a suite that has multiple effective initial cycles - for example, one starting at T00 and another starting at T12. What if they need to share an initial task?
Let’s suppose that we add the following section to the suite example above:
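    [[[T12]]]
        graph = "baz[-P1D] => baz"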
We’ll also say that there should be a starting dependence between prep and our new task baz - but we still want to have a single prep task, at a single cycle.
We can write this using a special case of the task[-interval] syntax - if the interval is null, this implies the task at the initial cycle point.
For example, we can write our suite as shown in Figure 23.

This neatly expresses what we want - a task running at the initial cycle point that has one-off dependencies with other task sets at different cycles.

A different kind of requirement is displayed in Figure 24. Usually, we want to specify additional tasks and dependencies at the initial cycle point. What if we want our first cycle point to be entirely special, with some tasks missing compared to subsequent cycle points?
In Figure 24, bar will not be run at the initial cycle point, but will still run at subsequent cycle points. [[[+PT6H/PT6H]]] means start at +PT6H (6 hours after the initial cycle point) and then repeat every PT6H (6 hours).
Some suites may have staggered start-up sequences where different tasks need running once but only at specific cycle points, potentially due to differing data sources at different cycle points with different possible initial cycle points. To allow this cylc provides a min( ) function that can be used as follows:
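    [[[R1/min(T00,T12)]]]
        graph = "prep1 => foo"
    [[[R1/min(T00,T06)]]]
        graph = "prep2 => foo"

(A sketch consistent with the example described below.)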
In this example the initial cycle point is 20100101T03, so the prep1 task will run once at 20100101T12 and the prep2 task will run once at 20100101T06 as these are the first cycle points after the initial cycle point in the respective min( ) entries.
Integer Cycling In addition to non-repeating and date-time cycling workflows, cylc can do integer cycling for repeating workflows that are not date-time based.
To construct an integer cycling suite, set [scheduling]cycling mode = integer, and specify integer values for the initial and (optional) final cycle points. The notation for intervals, offsets, and recurrences (sequences) is similar to the date-time cycling notation, but with simple integer values.
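A minimal sketch of an integer cycling suite (task names illustrative):

    [scheduling]
        cycling mode = integer
        initial cycle point = 1
        final cycle point = 10
        [[dependencies]]
            [[[P1]]]
                graph = "foo[-P1] => foo => bar"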
The full integer recurrence expressions supported are:
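Plausibly these two forms:

    Rn/start-point/interval   # e.g. R3/1/P2
    Rn/interval/end-point     # e.g. R3/P2/9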
But, as for date-time cycling, sequence start and end points can be omitted where suite initial and final cycle points can be assumed. Some examples:
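For instance (hedged; comments state the assumed interpretation):

    P1      # repeat every cycle, from the initial cycle point
    R1      # run once, at the initial cycle point
    R1/$    # run once, at the final cycle point
    R3/P2   # run three times, every other cycle, ending at the final cycle point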
Example The tutorial illustrates integer cycling in 7.23.4, and <cylc-dir>/etc/examples/satellite/ is a self-contained example of a realistic use for integer cycling. It simulates the processing of incoming satellite data: each new dataset arrives after a random (as far as the suite is concerned) interval, and is labeled by an arbitrary (as far as the suite is concerned) ID in the filename. A task called get_data at the top of the repeating workflow waits on the next dataset and, when it finds one, moves it to a cycle-point-specific shared workspace for processing by the downstream tasks. When get_data.1 finishes, get_data.2 triggers and begins waiting for the next dataset at the same time as the downstream tasks in cycle point 1 are processing the first one, and so on. In this way multiple datasets can be processed at once if they happen to come in quickly. A single shutdown task runs at the end of the final cycle to collate results. The suite graph is shown in Figure 25.
Advanced Integer Cycling Syntax The same syntax used to reference the initial and final cycle points (introduced in 9.3.4.2) for use with date-time cycling can also be used for integer cycling. For example you can write:
Likewise the syntax introduced in 9.3.4.3 for excluding a particular point from a recurrence also works for integer cycling. For example:
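E.g.:

    [[[P1!3]]]   # repeat every cycle, except at cycle point 3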
Multiple integer exclusions are also valid in the same way as the syntax in 9.3.4.3. Integer exclusions may be a list of single integer points, an integer sequence, or a combination of both:
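For instance (a hedged sketch):

    [[[P1!(2,3,7)]]]    # every cycle, except points 2, 3 and 7
    [[[P1!P2]]]         # every cycle, except the P2 sequence
    [[[P1!(P2,6,8)]]]   # every cycle, except a sequence plus two single points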
Trigger type, indicated by :type after the upstream task (or family) name, determines what kind of event results in the downstream task (or family) triggering.
Success Triggers The default, with no trigger type specified, is to trigger off the upstream task succeeding:
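With placeholder tasks A and B:

    graph = "A => B"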
For consistency and completeness, however, the success trigger can be explicit:
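    graph = "A:succeed => B"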
Failure Triggers To trigger off the upstream task reporting failure:
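    graph = "A:fail => B"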
Suicide triggers can be used to remove task B here if A does not fail, see 9.3.5.8.
Start Triggers To trigger off the upstream task starting to execute:
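    graph = "A:start => B"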
This can be used to trigger tasks that monitor other tasks once they (the target tasks) start executing. Consider a long-running forecast model, for instance, that generates a sequence of output files as it runs. A postprocessing task could be launched with a start trigger on the model (model:start => post) to process the model output as it becomes available. Note, however, that there are several alternative ways of handling this scenario: both tasks could be triggered at the same time (foo => model & post), but depending on external queue delays this could result in the monitoring task starting to execute first; or a different postprocessing task could be triggered off a message output for each data file (model:out1 => post1 etc.; see 9.3.5.5), but this may not be practical if the number of output files is large or if it is difficult to add cylc messaging calls to the model.
Finish Triggers To trigger off the upstream task succeeding or failing, i.e. finishing one way or the other:
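    graph = "A:finish => B"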
Message Triggers Tasks can also trigger off custom output messages. These must be registered in the [runtime] section of the emitting task, and reported using the cylc message command in task scripting. The graph trigger notation refers to the item name of the registered output message. The example suite <cylc-dir>/etc/examples/message-triggers illustrates message triggering.
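A minimal sketch (task names, message text, and the run-model.sh wrapper are illustrative):

    [scheduling]
        [[dependencies]]
            graph = "model:out1 => post1"
    [runtime]
        [[model]]
            script = """
                run-model.sh              # hypothetical model wrapper
                cylc message "file 1 done"
            """
            [[[outputs]]]
                out1 = "file 1 done"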
Job Submission Triggers It is also possible to trigger off a task submitting, or failing to submit:
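    graph = """
        A:submit => B
        A:submit-fail => C
    """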
A possible use case for submit-fail triggers: if a task goes into the submit-failed state, possibly after several job submission retries, another task that inherits the same runtime but sets a different job submission method and/or host could be triggered to, in effect, run the same job on a different platform.
Conditional Triggers AND operators (&) can appear on both sides of an arrow. They provide a concise alternative to defining multiple triggers separately:
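For instance (placeholder names):

    graph = """
        A & B => C    # C triggers off A and B
        D => E & F    # E and F trigger off D
    """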
OR operators (|), which result in true conditional triggers, can only appear on the left.2
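    graph = "A | B => C"   # C triggers off A or B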
Forecasting suites typically have simple conditional triggering requirements, but any valid conditional expression can be used, as shown in Figure 26 (conditional triggers are plotted with open arrow heads).
Suicide Triggers Suicide triggers take tasks out of the suite. This can be used for automated failure recovery. The suite.rc listing and accompanying graph in Figure 27 show how to define a chain of failure recovery tasks that trigger if they're needed but otherwise remove themselves from the suite (you can run the AutoRecover.async example suite to see how this works). The dashed graph edges ending in solid dots indicate suicide triggers, and the open arrowheads indicate conditional triggers as usual. Suicide triggers are ignored by default in the graph view, unless you toggle them on with View -> Options -> Ignore Suicide Triggers.

Note that multiple suicide triggers combine in the same way as other triggers, so this:
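(The next two listings are hedged sketches with placeholder names.)

    graph = """
        foo => !baz
        bar => !baz
    """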
is equivalent to this:
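    graph = "foo & bar => !baz"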
i.e. both foo and bar must succeed for baz to be taken out of the suite. If you really want a task to be taken out if any one of several events occurs then be careful to write it that way:
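    graph = "foo | bar => !baz"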
A word of warning on the meaning of “bare suicide triggers”. Consider the following suite:
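Plausibly:

    [scheduling]
        [[dependencies]]
            graph = "foo => !bar"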
Task bar has a suicide trigger but no normal prerequisites (a suicide trigger is not a task triggering prerequisite, it is a task removal prerequisite) so this is entirely equivalent to:
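(A hedged sketch of the equivalent graph:)

    [scheduling]
        [[dependencies]]
            graph = """
                foo
                bar          # bar triggers immediately, with no prerequisites
                foo => !bar  # bar is removed if foo succeeds
            """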
In other words both tasks will trigger immediately, at the same time, and then bar will be removed if foo succeeds.
If an active task proxy (currently in the submitted or running states) is removed from the suite by a suicide trigger, a warning will be logged.
Family Triggers Families defined by the namespace inheritance hierarchy (9.4) can be used in the graph to trigger whole groups of tasks at the same time (e.g. forecast model ensembles, or groups of tasks for processing different observation types), and for triggering downstream tasks off families as a whole. Higher level families, i.e. families of families, can also be used, and are reduced to the lowest level member tasks. Note that tasks can also trigger off individual family members if necessary.
To trigger an entire task family at once:
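Assuming a family FAM with members m1 and m2:

    graph = "foo => FAM"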
This is equivalent to:
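    graph = "foo => m1 & m2"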
To trigger other tasks off families we have to specify whether we are triggering off all members or off any member starting, succeeding, failing, or finishing. Legal family triggers are thus:
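Using FAM and a downstream task one as placeholders:

    # all-member triggers:
    FAM:start-all => one
    FAM:succeed-all => one
    FAM:fail-all => one
    FAM:finish-all => one
    # any-member triggers:
    FAM:start-any => one
    FAM:succeed-any => one
    FAM:fail-any => one
    FAM:finish-any => one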
Here's how to trigger downstream processing if one or more family members succeed, but only after all members have finished (succeeded or failed):
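    graph = "FAM:finish-all & FAM:succeed-any => foo"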
Writing Efficient Inter-Family Triggering While cylc allows writing dependencies between two families it is important to consider the number of dependencies this will generate. In the following example, each member of FAM2 has dependencies pointing at all the members of FAM1.
Expanding this out, you generate N × M dependencies, where N is the number of members of FAM1 and M is the number of members of FAM2. This can result in high memory use as the number of members of these families grows, potentially rendering the suite impractical for running on some systems.
You can greatly reduce the number of dependencies generated in these situations by putting dummy tasks in the graphing to represent the state of the family you want to trigger off. For example, if FAM2 should trigger off any member of FAM1 succeeding you can create a dummy task FAM1_succeed_any_marker and place a dependency on it as follows:
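A hedged sketch (the marker task just runs true so it completes immediately):

    [scheduling]
        [[dependencies]]
            graph = "FAM1:succeed-any => FAM1_succeed_any_marker => FAM2"
    [runtime]
        [[FAM1_succeed_any_marker]]
            script = true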
This graph generates only N + M dependencies, which takes significantly less memory and CPU to store and evaluate.
Inter-Cycle Triggers Typically most tasks in a suite will trigger off others in the same cycle point, but some may depend on others with other cycle points. This notably applies to warm-cycled forecast models, which depend on their own previous instances (see below); but other kinds of inter-cycle dependence are possible too.3 Here’s how to express this kind of relationship in cylc:
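For instance, a task that depends on its own instance from the previous 6-hourly cycle:

    [[[T00,T06,T12,T18]]]
        graph = "foo[-PT6H] => foo => bar"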
Inter-cycle and trigger type (or message trigger) notation can be combined:
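    graph = "foo[-PT6H]:fail => bar"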
At suite start-up inter-cycle triggers refer to a previous cycle point that does not exist. This does not cause the dependent task to wait indefinitely, however, because cylc ignores triggers that reach back beyond the initial cycle point. That said, the presence of an inter-cycle trigger does normally imply that something special has to happen at start-up. If a model depends on its own previous instance for restart files, for instance, then an initial set of restart files has to be generated somehow or the first model task will presumably fail with missing input files. There are several ways to handle this in cylc using different kinds of one-off (non-cycling) tasks that run at suite start-up. They are illustrated in the Tutorial (7.23.2); to summarize here briefly:
R1, or R1/date-time tasks are the recommended way to handle unusual start-up conditions. They allow you to make a clean distinction between the dependencies of the initial cycle point and those of subsequent cycle points.
Initial tasks can be used for real model cold-start processes, whereby a warm-cycled model at any given cycle point can in principle have its inputs satisfied by a previous instance of itself, or by an initial task with (nominally) the same cycle point.
In effect, the R1 task masquerades as the previous-cycle-point trigger of its associated cycling task. At suite start-up initial tasks will trigger the first cycling tasks, and thereafter the inter-cycle trigger will take effect.
If a task has a dependency on another task in a different cycle point, the dependency can be written using the [offset] syntax such as [-PT12H] in foo[-PT12H] => foo. This means that foo at the current cycle point depends on a previous instance of foo at 12 hours before the current cycle point. Unlike the cycling section headings (e.g. [[[T00,T12]]]), dependencies assume that relative times are relative to the current cycle point, not the initial cycle point.
However, it can be useful to have specific dependencies on tasks at or near the initial cycle point. You can switch the context of the offset to be the initial cycle point by using the caret symbol: ^.
For example, you can write foo[^] to mean foo at the initial cycle point, and foo[^+PT6H] to mean foo 6 hours after the initial cycle point. Usually, this kind of dependency will only apply in a limited number of cycle points near the start of the suite, so you may want to write it in R1-based cycling sections. Here’s the example inter-cycle R1 suite from above again.
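(A hedged reconstruction, consistent with the description that follows:)

    [scheduling]
        initial cycle point = 20130808T00
        final cycle point = 20130812T00
        [[dependencies]]
            [[[R1]]]
                graph = "prep"
            [[[R1/T00, R1/T12]]]
                graph = "prep[^] => foo"
            [[[T00, T12]]]
                graph = "foo[-PT12H] => foo => bar"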
You can see there is a dependence on the initial R1 task prep for foo at the first T00 cycle point, and at the first T12 cycle point. Thereafter, foo just depends on its previous (12 hours ago) instance.
Finally, it is also possible to have a dependency on a task at a specific cycle point.
However, in a long running suite, a repeating cycle should avoid having a dependency on a task with a specific cycle point (including the initial cycle point), as this can currently cause performance issues. In the following example, all instances of qux will depend on baz.20200101, which will never be removed from the task pool:
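    [scheduling]
        initial cycle point = 20200101
        [[dependencies]]
            [[[R1]]]
                graph = "baz"
            [[[P1D]]]
                graph = "baz[20200101] => qux"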
Special Sequential Tasks Tasks that depend on their own previous-cycle instance can be declared as sequential:
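For instance:

    [scheduling]
        [[special tasks]]
            sequential = foo
        [[dependencies]]
            [[[T00,T12]]]
                graph = "foo => bar"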
The sequential declaration is deprecated, however, in favour of explicit inter-cycle triggers, which clearly expose the same scheduling behaviour in the graph:
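    [scheduling]
        [[dependencies]]
            [[[T00,T12]]]
                graph = "foo[-PT12H] => foo => bar"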
The sequential declaration is arguably convenient in one unusual situation though: if a task has a non-uniform cycling sequence then multiple explicit triggers,
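For instance, on a sequence that cycles only at T00 and T06 the previous-instance offset differs per section:

    [[[T00]]]
        graph = "foo[-PT18H] => foo"
    [[[T06]]]
        graph = "foo[-PT6H] => foo"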
can be replaced by a single sequential declaration,
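    [scheduling]
        [[special tasks]]
            sequential = foo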
Future Triggers Cylc also supports inter-cycle triggering off tasks “in the future” (with respect to cycle point - which has no bearing on wall-clock job submission time unless the task has a clock trigger):
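A hedged sketch (placeholder names; the 6-hour future offset is illustrative):

    [[[T00,T06,T12,T18]]]
        graph = "A[+PT6H] => B"   # B triggers off A in the next cycle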
Future triggers present a problem at suite shutdown rather than at start-up. Here, B at the final cycle point wants to trigger off an instance of A that will never exist because it is beyond the suite stop point. Consequently Cylc prevents tasks from spawning successors that depend on other tasks beyond the final point.
Clock Triggers In addition to depending on other tasks (and on external events - see 9.3.5.16) tasks can depend on the wall clock: specifically, they can trigger off a wall clock time expressed as an offset from their own cycle point:
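For instance (the PT2H offset matches the discussion below):

    [scheduling]
        initial cycle point = 20150823T00
        [[special tasks]]
            clock-trigger = foo(PT2H)
        [[dependencies]]
            [[[T00]]]
                graph = foo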
Here, foo[2015-08-23T00] would trigger (other dependencies allowing) when the wall clock time reaches 2015-08-23T02. Clock-trigger offsets are normally positive, to trigger some time after the wall-clock time is equal to task cycle point.
Clock-triggers have no effect on scheduling if the suite is running sufficiently far behind the clock (e.g. after a delay, or because it is processing archived historical data) that the trigger times, which are relative to task cycle point, have already passed.
Clock-Expire Triggers Tasks can be configured to expire - i.e. to skip job submission and enter the expired state - if they are too far behind the wall clock when they become ready to run, and other tasks can trigger off this. As a possible use case, consider a cycling task that copies the latest of a set of files to overwrite the previous set: if the task is delayed by more than one cycle there may be no point in running it because the freshly copied files will just be overwritten immediately by the next task instance as the suite catches back up to real time operation. Clock-expire tasks are configured like clock-trigger tasks, with a date-time offset relative to cycle point (A.4.11.2). The offset should be positive to make the task expire if the wall-clock time has gone beyond the cycle point. Triggering off an expired task typically requires suicide triggers to remove the workflow that runs if the task has not expired. Here a task called copy expires, and its downstream workflow is skipped, if it is more than one day behind the wall-clock (see also etc/examples/clock-expire):
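A hedged sketch (task names illustrative; the offset sign follows the convention stated above):

    [scheduling]
        initial cycle point = now
        [[special tasks]]
            clock-expire = copy(P1D)   # copy expires if more than a day behind
        [[dependencies]]
            [[[P1D]]]
                graph = """
                    model[-P1D] => model => copy => proc
                    copy:expired => !proc   # skip the downstream workflow
                """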
External Triggers In addition to depending on other tasks (and on the wall clock - see 9.3.5.14) tasks can trigger off events reported by an external system. For example, an external process could detect incoming data on an ftp server, and then notify a suite containing a task to retrieve the new data for processing. This is an alternative to long-running tasks that poll for external events.
Note that cylc does not currently support triggering off “filesystem events” (e.g. inotify on Linux). However, external watcher processes can use filesystem events to detect triggering conditions, if that is appropriate, before notifying a suite with our general external event system.
The external triggering process must call cylc ext-trigger with the name of the target suite, the message that identifies this type of event to the suite, and an ID that distinguishes this particular event instance from others (the name of the target task or its current cycle point is not required). The event ID is just an arbitrary string to cylc, but it typically identifies the filename(s) of the latest dataset in some way. When the suite server program receives the external event notification it will trigger the next instance of any task waiting on that trigger (whatever its cycle point) and then broadcast (see 12.23) the event ID to the cycle point of the triggered task as $CYLC_EXT_TRIGGER_ID. Downstream tasks with the same cycle point therefore know the new event ID too and can use it, if they need to, to identify the same new dataset. In this way a whole workflow can be associated with each new dataset, and multiple datasets can be processed in parallel if they happen to arrive in quick succession.
An externally-triggered task must register the event it waits on in the suite scheduling section:
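For instance (task name and message are illustrative):

    [scheduling]
        [[special tasks]]
            external-trigger = get-data("new sat X data avail")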
Then, each time a new dataset arrives the external detection system should notify the suite like this:
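    cylc ext-trigger sat-proc "new sat X data avail" passX12334a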
where "sat-proc" is the suite name and "passX12334a" is the ID string for the new event. The suite passphrase must be installed on the triggering account.
Note that only one task in a suite can trigger off a particular external message. Other tasks can trigger off the externally triggered task as required, of course.
<cylc-dir>/etc/examples/satellite/ext-triggers/suite.rc is a working example of a simulated satellite processing suite.
External triggers are not normally needed in date-time cycling suites driven by real time data that comes in at regular intervals. In these cases a data retrieval task can be clock-triggered (and have appropriate retry intervals supplied) to submit at the expected data arrival time, so little time if any is wasted in polling. However, if the arrival time of the cycle-point-specific data is highly variable, external triggering may be used with the cycle point embedded in the message:
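A hedged sketch (names and dates illustrative):

    [scheduling]
        initial cycle point = 20150125T00
        [[special tasks]]
            external-trigger = get-data("data arrived for $CYLC_TASK_CYCLE_POINT")
        [[dependencies]]
            [[[P1D]]]
                graph = "get-data => process"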
Once the variable-length waiting is finished, an external detection system should notify the suite like this:
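    cylc ext-trigger data-proc "data arrived for 20150126T00" passX12334a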
where “data-proc” is the suite name, the cycle point has replaced the variable in the trigger string, and “passX12334a” is the ID string for the new event. The suite passphrase must be installed on the triggering account. In this case, the event will trigger for the second cycle point but not the first because of the cycle-point matching.
Warm-cycled forecast models generate restart files, e.g. model background fields, to initialize the next forecast. This kind of dependence requires an inter-cycle trigger:
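For instance:

    [[[T00,T06,T12,T18]]]
        graph = "A[-PT6H] => A"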
If your model is configured to write out additional restart files to allow one or more cycle points to be skipped in an emergency do not represent these potential dependencies in the suite graph as they should not be used under normal circumstances. For example, the following graph would result in task A erroneously triggering off A[T-24] as a matter of course, instead of off A[T-6], because A[T-24] will always be finished first:
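(A hedged sketch, writing the offsets in ISO form:)

    # ERROR: A always triggers off A[-PT24H], which finishes first
    [[[T00,T06,T12,T18]]]
        graph = "A[-PT24H] | A[-PT6H] => A"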
A graph trigger pair like foo => bar determines the existence and prerequisites (dependencies) of the downstream task bar, for the cycle points defined by the associated graph section heading. In general it does not say anything about the dependencies or existence of the upstream task foo. However if the trigger has no cycle point offset Cylc will infer that bar must exist at the same cycle points as foo. This is a convenience to allow this:
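    [[[T00]]]
        graph = "foo => bar"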
to be written as shorthand for this:
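    [[[T00]]]
        graph = """
            foo
            foo => bar
        """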
(where foo by itself means <nothing> => foo, i.e. the task exists at these cycle points but has no prerequisites - although other prerequisites may be defined for it in other parts of the graph).
Cylc does not infer the existence of the upstream task in offset triggers like foo[-P1D] => bar because, as explained in Section L.5, a typo in the offset interval should generate an error rather than silently creating tasks on an erroneous cycling sequence.
As a result you need to be careful not to define inter-cycle dependencies that cannot be satisfied at run time. Suite validation catches this kind of error if the existence of the cycle offset task is not defined anywhere at all:
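For instance (a hedged sketch):

    [scheduling]
        initial cycle point = 2020
        [[dependencies]]
            [[[P1Y]]]
                # ERROR: foo is not defined anywhere in the graph
                graph = "foo[-P1Y] => bar"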
To fix this, use another line in the graph to tell Cylc to define foo at each cycle point:
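    [[[P1Y]]]
        graph = """
            foo
            foo[-P1Y] => bar
        """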
But validation does not catch this kind of error if the offset task is defined only on a different cycling sequence:
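    [scheduling]
        initial cycle point = 2020
        [[dependencies]]
            [[[P2Y]]]
                graph = "foo"
            [[[P1Y]]]
                graph = "foo[-P1Y] => bar"   # foo does not exist at the intermediate years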
This suite will validate OK, but it will stall at runtime with bar waiting on foo[-P1Y] at the intermediate years where it does not exist. The offset [-P1Y] is presumably an error (it should be [-P2Y]), or else another graph line is needed to generate foo instances on the yearly sequence:
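    [[[P2Y]]]
        graph = "foo"
    [[[P1Y]]]
        graph = """
            foo
            foo[-P1Y] => bar
        """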
Similarly the following suite will validate OK, but it will stall at runtime with bar waiting on foo[-P1Y] in every cycle point, when only a single instance of it exists, at the initial cycle point:
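    [scheduling]
        initial cycle point = 2020
        [[dependencies]]
            [[[R1]]]
                graph = "foo"
            [[[P1Y]]]
                graph = "foo[-P1Y] => bar"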
Note that cylc graph will display un-satisfiable inter-cycle dependencies as “ghost nodes”. Figure 28 is a screenshot of cylc graph displaying the above example with the un-satisfiable task (foo) displayed as a “ghost node”.
The [runtime] section of a suite definition configures what to execute (and where and how to execute it) when each task is ready to run, in a multiple inheritance hierarchy of namespaces culminating in individual tasks. This allows all common configuration detail to be factored out and defined in one place.
Any namespace can configure any or all of the items defined in the Suite.rc Reference (A).
Namespaces that do not explicitly inherit from others automatically inherit from the root namespace (below).
Nested namespaces define task families that can be used in the graph as convenient shorthand for triggering all member tasks at once, or for triggering other tasks off all members at once - see 9.3.5.9. Nested namespaces can be progressively expanded and collapsed in the dependency graph viewer, and in the gcylc graph and text views. Only the first parent of each namespace (as for single-inheritance) is used for suite visualization purposes.
Namespace names may contain letters, digits, underscores, and hyphens.
Note that task names need not be hardwired into task implementations because task and suite identity can be extracted portably from the task execution environment supplied by the suite server program (9.4.7) - then to rename a task you can just change its name in the suite definition.
The root namespace, at the base of the inheritance hierarchy, provides default configuration for all tasks in the suite. Most root items are unset by default, but some have default values sufficient to allow test suites to be defined by dependency graph alone. The script item, for example, defaults to code that prints a message then sleeps for between 1 and 15 seconds and exits. Default values are documented with each item in A. You can override the defaults or provide your own defaults by explicitly configuring the root namespace.
If a namespace section heading is a comma-separated list of names then the subsequent configuration applies to each list member. Particular tasks can be singled out at run time using the $CYLC_TASK_NAME variable.
As an example, consider a suite containing an ensemble of closely related tasks that each invokes the same script but with a unique argument that identifies the calling task name:
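For instance (member names and the run-model.sh script are illustrative):

    [runtime]
        [[ENSEMBLE]]
            script = "run-model.sh $CYLC_TASK_NAME"
        [[m1, m2, m3]]
            inherit = ENSEMBLE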
For large ensembles Jinja2 template processing can be used to automatically generate the member names and associated dependencies (see 9.7).
The following listing of the inherit.single.one example suite illustrates basic runtime inheritance with single parents.
If a namespace inherits from multiple parents the linear order of precedence (which namespace overrides which) is determined by the so-called C3 algorithm used to find the linear method resolution order for class hierarchies in Python and several other object oriented programming languages. The result of this should be fairly obvious for typical use of multiple inheritance in cylc suites, but for detailed documentation of how the algorithm works refer to the official Python documentation here: http://www.python.org/download/releases/2.3/mro/.
The inherit.multi.one example suite, listed here, makes use of multiple inheritance:
cylc get-suite-config provides an easy way to check the result of inheritance in a suite. You can extract specific items, e.g.:
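E.g. (SUITE and the item path are placeholders):

    cylc get-suite-config --item='[runtime][m1]script' SUITE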
or use the --sparse option to print entire namespaces without obscuring the result with the dense runtime structure obtained from the root namespace:
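    cylc get-suite-config --sparse --item='[runtime]m1' SUITE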
Suite Visualization And Multiple Inheritance The first parent inherited by a namespace is also used as the collapsible family group when visualizing the suite. If this is not what you want, you can demote the first parent for visualization purposes, without affecting the order of inheritance of runtime properties:
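A hedged sketch (family names are placeholders):

    [runtime]
        [[foo]]
            # inherit from FAM1 and FAM2, but do not use FAM1 as the
            # collapsible family group for visualization:
            inherit = None, FAM1, FAM2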
The linear precedence order of ancestors is computed for each namespace using the C3 algorithm. Then any runtime items that are explicitly configured in the suite definition are “inherited” up the linearized hierarchy for each task, starting at the root namespace: if a particular item is defined at multiple levels in the hierarchy, the level nearest the final task namespace takes precedence. Finally, root namespace defaults are applied for every item that has not been configured in the inheritance process (this is more efficient than carrying the full dense namespace structure through from root from the beginning).
The task execution environment contains suite and task identity variables provided by the suite server program, and user-defined environment variables. The environment is explicitly exported (by the task job script) prior to executing the task script (see 11).
Suite and task identity are exported first, so that user-defined variables can refer to them. Order of definition is preserved throughout so that variable assignment expressions can safely refer to previously defined variables.
Additionally, access to cylc itself is configured prior to the user-defined environment, so that variable assignment expressions can make use of cylc utility commands:
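For instance (a hedged sketch; the variable name is illustrative):

    [runtime]
        [[foo]]
            [[[environment]]]
                REF_TIME = $( cylc cyclepoint --offset-hours=6 )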
User Environment Variables A task’s user-defined environment results from its inherited [[[environment]]] sections:
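For instance (consistent with the result described below):

    [runtime]
        [[root]]
            [[[environment]]]
                COLOR = red
                SHAPE = circle
        [[foo]]
            [[[environment]]]
                COLOR = blue     # overrides COLOR from root
                TEXTURE = rough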
This results in a task foo with SHAPE=circle, COLOR=blue, and TEXTURE=rough in its environment.
Overriding Environment Variables When you override inherited namespace items the original parent item definition is replaced by the new definition. This applies to all items including those in the environment sub-sections which, strictly speaking, are not “environment variables” until they are written, post inheritance processing, to the task job script that executes the associated task. Consequently, if you override an environment variable you cannot also access the original parent value:
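A hedged sketch of the error:

    [runtime]
        [[FOO]]
            [[[environment]]]
                COLOR = red
        [[bar]]
            inherit = FOO
            [[[environment]]]
                tmp = $COLOR        # ERROR: the parent value is already replaced
                COLOR = dark-$tmp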
The compressed variant of this, COLOR = dark-$COLOR, is also in error for the same reason. To achieve the desired result you must use a different name for the parent variable:
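    [runtime]
        [[FOO]]
            [[[environment]]]
                FOO_COLOR = red
        [[bar]]
            inherit = FOO
            [[[environment]]]
                COLOR = dark-$FOO_COLOR   # OK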
Task Job Script Variables These are variables that can be referenced (but should not be modified) in a task job script.
The task job script may export the following environment variables:
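A few of the commonly used ones (not a complete list; see the reference for the full set):

    CYLC_SUITE_NAME         # name of the suite
    CYLC_TASK_NAME          # name of this task
    CYLC_TASK_CYCLE_POINT   # cycle point of this task
    CYLC_TASK_WORK_DIR      # task work directory
    CYLC_SUITE_SHARE_DIR    # suite share directory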
There are also some global shell variables that may be defined in the task job script (but not exported to the environment). These include:
Suite Share Directories A suite share directory is created automatically under the suite run directory as a share space for tasks. The location is available to tasks as $CYLC_SUITE_SHARE_DIR. In a cycling suite, output files are typically held in cycle point sub-directories of the suite share directory.
The top level share and work directory (below) location can be changed (e.g. to a large data area) by a global config setting (see B.9.1.2).
Task Work Directories Task job scripts are executed from within work directories created automatically under the suite run directory. A task can get its own work directory from $CYLC_TASK_WORK_DIR (or simply $PWD if it does not cd elsewhere at runtime). By default the location contains task name and cycle point, to provide a unique workspace for every instance of every task. This can be overridden in the suite definition, however, to get several tasks to share the same work directory (see A.5.1.9).
The top level work and share directory (above) location can be changed (e.g. to a large data area) by a global config setting (see B.9.1.2).
Environment Variable Evaluation Variables in the task execution environment are not evaluated in the shell in which the suite is running prior to submitting the task. They are written in unevaluated form to the job script that is submitted by cylc to run the task (10.1) and are therefore evaluated when the task begins executing under the task owner account on the task host. Thus $HOME, for instance, evaluates at run time to the home directory of the task owner on the task host.
Tasks can use $CYLC_SUITE_DEF_PATH to access suite files on the task host, and the suite bin directory is automatically added to $PATH. If a remote suite definition directory is not specified the local (suite host) path will be assumed with the local home directory, if present, swapped for literal $HOME for evaluation on the task host.
If a task declares an owner other than the suite owner and/or a host other than the suite host, cylc will use non-interactive ssh to execute the task on the owner@host account by the configured batch system:
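For instance (host and owner are illustrative):

    [runtime]
        [[foo]]
            script = "run-foo.sh"
            [[[remote]]]
                host = orca.niwa.co.nz
                owner = bob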
For this to work, non-interactive (passwordless) ssh must be configured from the suite host to the remote owner@host account.
To learn how to give remote tasks access to cylc, see 12.3.
Tasks running on the suite host under another user account are treated as remote tasks.
Remote hosting, like all namespace settings, can be declared globally in the root namespace, or per family, or for individual tasks.
Dynamic Host Selection Instead of hardwiring host names into the suite definition you can specify a shell command that prints a hostname, or an environment variable that holds a hostname, as the value of the host config item. See A.5.1.12.1.
Remote Task Log Directories Task stdout and stderr streams are written to log files in a suite-specific sub-directory of the suite run directory, as explained in 11.2. For remote tasks the same directory is used, but on the task host. Remote task log directories, like local ones, are created on the fly, if necessary, during job submission.
The visualization section of a suite definition is used to configure suite graphing, principally graph node (task) and edge (dependency arrow) style attributes. Tasks can be grouped for the purpose of applying common style attributes. See A for details.
Nested families from the runtime inheritance hierarchy can be expanded and collapsed in suite graphs and the gcylc graph view. All families are displayed in the collapsed state at first, unless [visualization]collapsed families is used to single out specific families for initial collapsing.
In the gcylc graph view, nodes outside of the main graph (such as the members of collapsed families) are plotted as rectangular nodes to the right if they are doing anything interesting (submitted, running, failed).
Figure 29 illustrates successive expansion of nested task families in the namespaces example suite.
Cylc can automatically generate tasks and dependencies by expanding parameterized task names over lists of parameter values. Uses for this include generating an ensemble of similar model runs, generating chains of tasks to process similar datasets, replicating an entire workflow (or part thereof) over several runs, and splitting a long model run into smaller chunks.
Note that this can be done with Jinja2 loops too (Section 9.7) but parameterization is much cleaner (nested loops can seriously reduce the clarity of a suite definition).
Parameter values can be lists of strings, or lists of integers and integer ranges (with inclusive bounds). Numeric values in a list of strings are considered strings. It is not possible to mix strings with integer ranges.
For example:
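(The snippets in this section are hedged reconstructions assuming these illustrative parameter values:)

    [cylc]
        [[parameters]]
            run = 1..2
            obs = ship, buoy, plane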
Then angle brackets denote use of these parameters throughout the suite definition. For the values above, this parameterized name:
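    model<run>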
expands to these concrete task names:
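    model_run1, model_run2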
and this parameterized name:
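    proc<obs>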
expands to these concrete task names:
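    proc_ship, proc_buoy, proc_plane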
By default, to avoid any ambiguity, the parameter name appears in the expanded task names for integer values, but not for string values. For example, model_run1 for run = 1, but proc_ship for obs = ship. However, the default expansion templates can be overridden if need be:
(See A.3.12 for more on the string template syntax.)
Any number of parameters can be used at once. This parameterization:
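    model<run,obs>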
expands to these task names:
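    model_run1_ship, model_run1_buoy, model_run1_plane,
    model_run2_ship, model_run2_buoy, model_run2_plane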
Here’s a simple but complete example suite:
The result, post parameter expansion, is this:
Here’s a more complex graph using two parameters ([runtime] omitted):
Figure 30 shows the result as visualized by cylc graph.
Zero-Padded Integer Values Integer parameter values are given a default template for generating task suffixes that are zero-padded according to the longest size of their values. For example, the default template for p = 9..10 would be _p%(p)02d, so that foo<p> would become foo_p09, foo_p10. If negative values are present in the parameter list, the default template will include the sign. For example, the default template for p = -1..1 would be _p%(p)+02d, so that foo<p> would become foo_p-1, foo_p+0, foo_p+1.
To get thicker padding and/or alternate suffixes, use a template. E.g.:
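A hedged sketch:

    [cylc]
        [[parameters]]
            p = 3..14
        [[parameter templates]]
            p = _p%(p)03d   # foo<p> becomes foo_p003, ..., foo_p014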
Parameters as Full Task Names Parameter values can be used as full task names, but the default template should be overridden to remove the initial underscore. For example:
Parameter values are passed as environment variables to tasks generated by parameter expansion. For example, if we have:
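    [cylc]
        [[parameters]]
            run = 1..2
            obs = ship, buoy, plane
    [scheduling]
        [[dependencies]]
            graph = "model<run,obs>"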
Then task model_run2_ship would get the following standard environment variables:
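Plausibly:

    CYLC_TASK_PARAM_run=2
    CYLC_TASK_PARAM_obs=ship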
These variables allow tasks to determine which member of a parameterized group they are, and so to vary their behaviour accordingly.
You can also define custom variables and string templates for parameter value substitution. For example, if we add this to the above configuration:
Then task model_run2_ship would get the following custom environment variables:
Specific parameter values can be singled out in the graph and under [runtime] with the notation <p=5> (for example). Here’s how to make a special task trigger off just the first of a set of model runs:
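For instance (task names illustrative):

    graph = """
        model<run> => post<run>
        model<run=1> => check_first_run   # only the run=1 member
    """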
The parameter notation does not currently support partial range selection such as foo<p=5..10>, but you can achieve the same result by defining a second parameter that covers the partial range and giving it the same expansion template as the full-range parameter. For example:
A negative offset notation <NAME-1> is interpreted as the previous value in the ordered list of parameter values, while a positive offset is interpreted as the next value. For example, to split a model run into multiple steps with each step depending on the previous one, either of these graphs:
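(Two hedged variants, using the run parameter from above:)

    graph = "model<run-1> => model<run>"
    # or, equivalently:
    graph = "model<run> => model<run+1>"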
expands to:
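    model_run1 => model_run2   # for run = 1..2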
And this graph:
expands to:
However, a quirk in the current system means that you should avoid mixing conditional logic in these statements. For example, the following will do the unexpected:
currently expands to:
For the time being, writing out the logic explicitly will give you the correct graph.
Task family members can be generated by parameter expansion:
Family names can be parameterized too, just like task names:
As described in Section 9.3.5.9 family names can be used to trigger all members at once:
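    graph = "foo => FAMILY"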
or to trigger off all members:
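    graph = "FAMILY:succeed-all => foo"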
or to trigger off any members:
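    graph = "FAMILY:succeed-any => foo"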
If the members of FAMILY were generated with parameters, you can also trigger them all at once with parameter notation:
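Assuming members generated as member<i>:

    graph = "foo => member<i>"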
Similarly, to trigger off all members:
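    graph = "member<i> => foo"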
Family names are still needed in the graph, however, to succinctly express “succeed-any” triggering semantics, and all-to-all or any-to-all triggering:
(Direct all-to-all and any-to-all family triggering is not recommended for efficiency reasons though - see Section 9.3.5.10).
For family member-to-member triggering use parameterized members. For example, if family OBS_GET has members get<obs> and family OBS_PROC has members proc<obs> then this graph:
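    graph = "get<obs> => proc<obs>"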
expands to:
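    get_ship => proc_ship
    get_buoy => proc_buoy
    get_plane => proc_plane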
Two ways of constructing cycling systems are described and contrasted in Section 5. For most purposes use of a proper cycling workflow is recommended, wherein cylc incrementally generates the date-time sequence and extends the workflow, potentially indefinitely, at run time. For smaller systems of finite duration, however, parameter expansion can be used to generate a sequence of pre-defined tasks as a proxy for cycling.
Here’s a cycling workflow of two-monthly model runs for one year, with previous-instance model dependence (e.g. for model restart files):
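A hedged sketch (dates and task names illustrative):

    [scheduling]
        initial cycle point = 2020-01
        final cycle point = 2020-11
        [[dependencies]]
            [[[P2M]]]
                graph = "model[-P2M] => model => post_proc"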
And here’s how to do the same thing with parameterized tasks:
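    [cylc]
        [[parameters]]
            chunk = 1..6
    [scheduling]
        [[dependencies]]
            graph = "model<chunk-1> => model<chunk> => post_proc<chunk>"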
The two workflows are shown together in Figure 31. They both achieve the same result, and both can include special tasks at the start, end, or anywhere in between. But as noted earlier the parameterized version has several disadvantages: it must be finite in extent and not too large; the date-time arithmetic has to be done by the user; and the full extent of the workflow will be visible at all times as the suite runs.


Here’s a yearly-cycling suite with four parameterized chunks in each cycle point:
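A hedged sketch, consistent with the note that follows:

    [cylc]
        [[parameters]]
            chunk = 1..4
    [scheduling]
        initial cycle point = 2020
        [[dependencies]]
            [[[P1Y]]]
                graph = """
                    model<chunk-1> => model<chunk>
                    model<chunk=4>[-P1Y] => model<chunk=1>
                """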
Note the inter-cycle trigger that connects the first chunk in each cycle point to the last chunk in the previous cycle point. Of course it would be simpler to just use 3-monthly cycling:
Here’s a possible valid use-case for mixed cycling: consider a portable date-time cycling workflow of model jobs that can each take too long to run on some supported platforms. This could be handled without changing the cycling structure of the suite by splitting the run (at each cycle point) into a variable number of shorter steps, using more steps on less powerful hosts.
Cycle Point And Parameter Offsets At Start-Up In cycling workflows cylc ignores anything earlier than the suite initial cycle point. So this graph:
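    graph = "model[-P1D] => model"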
simplifies at the initial cycle point to this:
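    model   # the [-P1D] trigger is ignored at the initial cycle point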
Similarly, parameter offsets are ignored if they extend beyond the start of the parameter value list. So this graph:
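    graph = "model<chunk-1> => model<chunk>"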
simplifies for chunk=1 to this:
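    model_chunk1   # the <chunk-1> offset is ignored for the first value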
Note however that the initial cut-off applies to every parameter list, but only to cycle point sequences that start at the suite initial cycle point. Therefore it may be somewhat easier to use parameterized cycling if you need multiple date-time sequences with different start points in the same suite. We plan to allow this sequence-start simplification for any date-time sequence in the future, not just at the suite initial point, but it needs to be optional because delayed-start cycling tasks sometimes need to trigger off earlier cycling tasks.
This section needs to be revised - the Parameterized Task feature introduced in cylc-6.11.0 (see Section 9.6) provides a cleaner way to auto-generate tasks without coding messy Jinja2 loops.
Cylc has built in support for the Jinja2 template processor in suite definitions. Jinja2 variables, mathematical expressions, loop control structures, conditional logic, etc., are automatically processed to generate the final suite definition seen by cylc.
The need for Jinja2 processing must be declared with a hash-bang comment as the first line of the suite.rc file:
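That is:

    #!Jinja2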
Potential uses for this include automatic generation of repeated groups of similar tasks and dependencies, and inclusion or exclusion of entire suite sections according to the value of a single flag. Consider a large complicated operational suite and several related parallel test suites with slightly different task content and structure (the parallel suites, for instance, might take certain large input files from the operation or the archive rather than downloading them again) - these can now be maintained as a single master suite definition that reconfigures itself according to the value of a flag variable indicating the intended use.
Template processing is the first thing done on parsing a suite definition so Jinja2 expressions can appear anywhere in the file (inside strings and namespace headings, for example).
Jinja2 is well documented at http://jinja.pocoo.org/docs, so here we just provide an example suite that uses it. The meaning of the embedded Jinja2 code should be reasonably self-evident to anyone familiar with standard programming techniques.
The jinja2.ensemble example, graphed in Figure 32, shows an ensemble of similar tasks generated using Jinja2:
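A hedged sketch of such a suite (using a small ensemble size for brevity):

    #!Jinja2
    {% set N = 3 %}
    [scheduling]
        [[dependencies]]
            graph = """
    {% for I in range(N) %}
                foo => mem_{{I}} => post_{{I}} => bar
    {% endfor %}
            """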
Here is the generated suite definition, after Jinja2 processing:
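    [scheduling]
        [[dependencies]]
            graph = """
                foo => mem_0 => post_0 => bar
                foo => mem_1 => post_1 => bar
                foo => mem_2 => post_2 => bar
            """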
And finally, the jinja2.cities example uses variables, includes or excludes special cleanup tasks according to the value of a logical flag, and automatically generates all dependencies and family relationships for a group of tasks that is repeated for each city in the suite. To add a new city and associated tasks and dependencies simply add the city name to the list at the top of the file. The suite is graphed, with the New York City task family expanded, in Figure 33.
This functionality is not provided by Jinja2 by default, but cylc automatically imports the user environment to the template in a dictionary structure called environ. A usage example:
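For instance (the variable name is illustrative):

    #!Jinja2
    [runtime]
        [[root]]
            [[[environment]]]
                SUITE_OWNER_HOME_DIR_ON_SUITE_HOST = {{ environ['HOME'] }}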
This example emphasizes that the environment is read on the suite host at the time the suite definition is parsed - it is not, for instance, read at task run time on the task host.
Jinja2 variable values can be modified by “filters”, using pipe notation. For example, the built-in trim filter strips leading and trailing white space from a string:
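E.g. (this would render as "a string"):

    {{ "   a string   " | trim }}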
(See official Jinja2 documentation for available built-in filters.)
Cylc also supports custom Jinja2 filters. A custom filter is a single Python function in a source file with the same name as the function (plus “.py” extension) and stored in one of the following locations:
In the filter function argument list, the first argument is the variable value to be “filtered”, and subsequent arguments can be whatever is needed. Currently there are two custom filters:
pad The “pad” filter is for padding string values to some constant length with a fill character - useful for generating task names and related values in ensemble suites:
strftime The “strftime” filter can be used to format ISO8601 date-time strings using an strftime string.
Examples:
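For instance (hedged; this would render as 2000-01-01):

    {{ '2000-01-01T00:00Z' | strftime('%Y-%m-%d') }}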
It is also possible to parse non-standard date-time strings by passing a strptime string as the second argument.
Examples:
Associative arrays (dicts in Python) can be very useful. Here's an example, from <cylc-dir>/etc/examples/jinja2/dict:
Here’s the result:
The values of Jinja2 variables can be passed in from the cylc command line rather than hardwired in the suite definition. Here's an example, from <cylc-dir>/etc/examples/jinja2/defaults:
Here’s the result:
Note also that cylc view --set FIRST_TASK=bob --jinja2 SUITE will show the suite with the Jinja2 variables as set.
Note: suites started with template variables set on the command line will restart with the same settings. However, you can set them again on the cylc restart command line if they need to be overridden.
Jinja2 variable scoping rules may be surprising. Variables set inside a for loop block, for instance, are not accessible outside of the block, so the following will print #FOO is 0, not #FOO is 9:
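For instance:

    {% set FOO = 0 %}
    {% for i in range(10) %}
        {% set FOO = i %}
    {% endfor %}
    #FOO is {{ FOO }}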
Jinja2 documentation suggests using alternative constructs like the loop else block or the special loop variable. More complex use cases can be handled using namespace objects which allow propagating of changes across scopes:
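E.g. (this renders "#FOO is 9"; namespace objects require Jinja2 2.10 or later):

    {% set ns = namespace(foo=0) %}
    {% for i in range(10) %}
        {% set ns.foo = i %}
    {% endfor %}
    #FOO is {{ ns.foo }}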
For detail, see: Jinja2 Template Designer Documentation > Assignments
Cylc provides two functions for raising exceptions using Jinja2. These exceptions are raised when the suite.rc file is loaded and will prevent a suite from running.
Note: These functions must be contained within {{ ... }} Jinja2 expression blocks, as opposed to {% ... %} statement blocks.
Raise The “raise” function will result in an error containing the provided text.
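    {{ raise('This is an error message') }}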
Assert The “assert” function will raise an exception containing the text provided in the second argument providing that the first argument evaluates as False. The following example is equivalent to the “raise” example above.
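    {{ assert(False, 'This is an error message') }}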
It is sometimes convenient to omit certain tasks from the suite at runtime without actually deleting their definitions from the suite.
Defining [runtime] properties for tasks that do not appear in the suite graph results in verbose-mode validation warnings that the tasks are disabled. They cannot be used because the suite graph is what defines their dependencies and valid cycle points. Nevertheless, it is legal to leave these orphaned runtime sections in the suite definition because it allows you to temporarily remove tasks from the suite by simply commenting them out of the graph.
To omit a task from the suite at runtime but still leave it fully defined and available for use (by insertion or cylc submit) use one or both of [scheduling][[special task]] lists, include at start-up or exclude at start-up (documented in A.4.11.6 and A.4.11.5). Then the graph still defines the validity of the tasks and their dependencies, but they are not actually loaded into the suite at start-up. Other tasks that depend on the omitted ones, if any, will have to wait on their insertion at a later time or otherwise be triggered manually.
Finally, with Jinja2 (9.7) you can radically alter suite structure by including or excluding tasks from the [scheduling] and [runtime] sections according to the value of a single logical flag defined at the top of the suite.
A naked dummy task appears in the suite graph but has no explicit runtime configuration section. Such tasks automatically inherit the default “dummy task” configuration from the root namespace. This is very useful because it allows functional suites to be mocked up quickly for test and demonstration purposes by simply defining the graph. It is somewhat dangerous, however, because there is no way to distinguish an intentional naked dummy task from one generated by typographic error: misspelling a task name in the graph results in a new naked dummy task replacing the intended task in the affected trigger expression; and misspelling a task name in a runtime section heading results in the intended task becoming a dummy task itself (by divorcing it from its intended runtime config section).
To avoid this problem any dummy task used in a real suite should not be naked - i.e. it should have an explicit entry under the runtime section of the suite definition, even if the section is empty. This results in exactly the same dummy task behaviour, via implicit inheritance from root, but it allows use of cylc validate --strict to catch errors in task names by failing the suite if any naked dummy tasks are detected.
Existing scripts and executables can be used as cylc tasks without modification so long as they return standard exit status - zero on success, non-zero for failure - and do not spawn detaching processes internally (see 10.5).
When the suite server program determines that a task is ready to run it generates a job script that embodies the task runtime configuration in the suite.rc file, and submits it to the configured job host and batch system (see 11).
Task job scripts are written to the suite’s job log directory. They can be printed with cylc cat-log or generated and printed with cylc jobscript.
Task script items can be multi-line strings of bash code, so many tasks can be entirely inlined in the suite.rc file. For anything more than a few lines of code, however, we recommend using external shell scripts to allow independent testing, re-use, and shell mode editing.
Task messages can be sent back to the suite server program to report completed outputs and arbitrary messages of different severity levels.
Some types of message - in addition to events like task failure - can optionally trigger execution of event handlers in the suite server program (see 12.19).
Normal severity messages are printed to job.out and logged by the suite server program:
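E.g.:

    cylc message "Hello from $CYLC_TASK_ID"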
CUSTOM severity messages are printed to job.out, logged by the suite server program, and can be used to trigger custom event handlers:
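    cylc message "CUSTOM:data available for special product generation"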
Custom severity messages and event handlers can be used to signal special events that are neither routine information nor an error condition, such as production of a particular data file. Task output messages, used for triggering other tasks, can also be sent with custom severity if need be.
WARNING severity messages are printed to job.err, logged by the suite server program, and can be passed to warning event handlers:
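    cylc message "WARNING:disk space is running low"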
CRITICAL severity messages are printed to job.err, logged by the suite server program, and can be passed to critical event handlers:
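    cylc message "CRITICAL:failed to write to the archive database"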
Task job scripts use set -e to abort on any error, and trap ERR, EXIT, and SIGTERM to send task failed messages back to the suite server program before aborting. Other scripts called from job scripts should therefore abort with standard non-zero exit status on error, to trigger the job script error trap.
To prevent a command that is expected to generate a non-zero exit status from triggering the exit trap, protect it with a control statement such as:
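For instance (grep returns non-zero when it finds no match):

    if ! grep -q FOO input.txt; then
        :   # no match found; this is expected, carry on
    fi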
Task job scripts also use set -u to abort on referencing any undefined variable (useful for picking up typos); and set -o pipefail to abort if any part of a pipe fails (by default the shell only returns the exit status of the final command in a pipeline).
Critical events normally warrant aborting a job script rather than just sending a message. As described just above, exit 1 or any failing command not protected by the surrounding scripting will cause a job script to abort and report failure to the suite server program, potentially triggering a failed task event handler.
For failures detected by the scripting you could send a critical message back before aborting, potentially triggering a critical task event handler:
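A hedged sketch (check-the-database is a hypothetical command):

    if ! check-the-database; then
        cylc message "CRITICAL:database check failed"
        exit 1
    fi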
To abort a job script with a custom message that can be passed to a failed task event handler, use the built-in cylc__job_abort shell function:
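For instance (validate-inputs is hypothetical):

    validate-inputs || cylc__job_abort "input validation failed"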
If a task script starts background sub-processes and does not wait on them, or internally submits jobs to a batch scheduler and then exits immediately, the detached processes will not be visible to cylc and the task will appear to finish when the top-level script finishes. You will need to modify scripts like this to make them execute all sub-processes in the foreground (or use the shell wait command to wait on them before exiting) and to prevent job submission commands from returning before the job completes (e.g. llsubmit -s for Loadleveler, qsub -sync yes for Sun Grid Engine, and qsub -W block=true for PBS).
If this is not possible - perhaps you don’t have control over the script or can’t work out how to fix it - one alternative approach is to use another task to repeatedly poll for the results of the detached processes:
For the requirements a command, script, or program, must fulfill in order to function as a cylc task, see 10. This section explains how tasks are submitted by the suite server program when they are ready to run, and how to define new batch system handlers.
When a task is ready cylc generates a job script (see 10.1). The job script is submitted to run by the batch system chosen for the task. Different tasks can use different batch systems. Like other runtime properties, you can set a suite default batch system and override it for specific tasks or families:
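For instance (task names illustrative):

    [runtime]
        [[root]]
            [[[job]]]
                batch system = background
        [[big_model]]
            [[[job]]]
                batch system = pbs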
Cylc supports a number of commonly used batch systems. See 11.7 for how to add new job submission methods.
Runs task job scripts as Unix background processes.
If an execution time limit is specified for a task, its job will be wrapped by the timeout command.
Submits task job scripts to the rudimentary Unix at scheduler. The atd daemon must be running.
If an execution time limit is specified for a task, its job will be wrapped by the timeout command.
Submits task job scripts to loadleveler by the llsubmit command. Loadleveler directives can be provided in the suite.rc file:
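For instance (directive names and values are illustrative):

    [runtime]
        [[my_task]]
            [[[job]]]
                batch system = loadleveler
            [[[directives]]]
                foo = bar
                baz = qux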
These are written to the top of the task job script like this:
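    # @ foo = bar
    # @ baz = qux
    # @ queue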
If restart=yes is specified as a directive for loadleveler, the job will automatically trap SIGUSR1, which loadleveler may use to preempt the job. On trapping SIGUSR1, the job will inform the suite that it has been vacated by loadleveler. This will put it back to the submitted state, until it starts running again.
If execution time limit is specified, it is used to generate the wall_clock_limit directive. The setting is assumed to be the soft limit. The hard limit will be set by adding an extra minute to the soft limit. Do not specify the wall_clock_limit directive explicitly if execution time limit is specified. Otherwise, the execution time limit known by the suite may be out of sync with what is submitted to the batch system.
Submits task job scripts to IBM Platform LSF by the bsub command. LSF directives can be provided in the suite.rc file:
These are written to the top of the task job script like this:
If execution time limit is specified, it is used to generate the -W directive. Do not specify the -W directive explicitly if execution time limit is specified. Otherwise, the execution time limit known by the suite may be out of sync with what is submitted to the batch system.
Submits task job scripts to PBS (or Torque) by the qsub command. PBS directives can be provided in the suite.rc file:
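For instance (directive values are illustrative):

    [runtime]
        [[my_task]]
            [[[job]]]
                batch system = pbs
            [[[directives]]]
                -V =
                -q = foo
                -l nodes = 1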
These are written to the top of the task job script like this:
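    #PBS -V
    #PBS -q foo
    #PBS -l nodes=1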
If execution time limit is specified, it is used to generate the -l walltime directive. Do not specify the -l walltime directive explicitly if execution time limit is specified. Otherwise, the execution time limit known by the suite may be out of sync with what is submitted to the batch system.
Submits task job scripts to the Moab workload manager by the msub command. Moab directives can be provided in the suite.rc file; the syntax is very similar to PBS:
These are written to the top of the task job script like this:
(Moab understands #PBS directives).
If execution time limit is specified, it is used to generate the -l walltime directive. Do not specify the -l walltime directive explicitly if execution time limit is specified. Otherwise, the execution time limit known by the suite may be out of sync with what is submitted to the batch system.
Submits task job scripts to Sun/Oracle Grid Engine by the qsub command. SGE directives can be provided in the suite.rc file:
These are written to the top of the task job script like this:
If execution time limit is specified, it is used to generate the -l h_rt directive. Do not specify the -l h_rt directive explicitly if execution time limit is specified. Otherwise, the execution time limit known by the suite may be out of sync with what is submitted to the batch system.
Submits task job scripts to Simple Linux Utility for Resource Management by the sbatch command. SLURM directives can be provided in the suite.rc file (note that since not all SLURM commands have a short form, cylc requires the long form directives):
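For instance (directive values are illustrative):

    [runtime]
        [[my_task]]
            [[[job]]]
                batch system = slurm
            [[[directives]]]
                --nodes = 5
                --account = QXZ5W2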
These are written to the top of the task job script like this:
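    #SBATCH --nodes=5
    #SBATCH --account=QXZ5W2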
If execution time limit is specified, it is used to generate the --time directive. Do not specify the --time directive explicitly if execution time limit is specified. Otherwise, the execution time limit known by the suite may be out of sync with what is submitted to the batch system.
For batch systems that use job file directives (PBS, Loadleveler, etc.) default directives are provided to set the job name, stdout and stderr file paths, and the execution time limit (if specified).
Cylc constructs the job name string using a combination of the task ID and the suite name. PBS fails a job submit if the job name in -N name is too long. For version 12 or below, this is 15 characters. For version 13, this is 236 characters. The default setting will truncate the job name string to 15 characters. If you have PBS 13 at your site, you should modify your site’s global configuration file to allow the job name to be longer. (See also Section B.9.1.18.5.) For example:
To specify an option with no argument, such as -V in PBS or -cwd in SGE you must give a null string as the directive value in the suite.rc file.
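E.g.:

    [[[directives]]]
        -V =      # PBS
        -cwd =    # SGE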
The left hand side of a setting (i.e. the string before the first equal sign) must be unique. To specify multiple values using an option such as -l option in PBS, SGE, etc., either specify all items in a single line:
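    [[[directives]]]
        -l = select=28:ncpus=36:mem=20gb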
(Left hand side is -l. A second -l=... line will override the first.)
Or separate the items (note: no equal sign after -l):
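    [[[directives]]]
        -l select = 28
        -l ncpus = 36
        -l mem = 20gb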
(Left hand sides are now -l select, -l ncpus, etc.)
When a task is ready to run cylc generates a filename root to be used for the task job script and log files. The filename contains the task name, cycle point, and a submit number that increments if the same task is re-triggered multiple times:
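A hedged sketch of the resulting layout under the suite run directory:

    log/job/<cycle-point>/<task-name>/<submit-num>/job       # task job script
    log/job/<cycle-point>/<task-name>/<submit-num>/job.out   # stdout
    log/job/<cycle-point>/<task-name>/<submit-num>/job.err   # stderr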
How the stdout and stderr streams are directed into these files depends on the batch system. The background method just uses appropriate output redirection on the command line, as shown above. The loadleveler method writes appropriate directives to the job script that is submitted to loadleveler.
Cylc obviously has no control over the stdout and stderr output from tasks that do their own internal output management (e.g. tasks that submit internal jobs and direct the associated output to other files). For less internally complex tasks, however, the files referred to here will be complete task job logs.
Some batch systems, such as pbs, redirect a job’s stdout and stderr streams to a separate cache area while the job is running. The contents are only copied to the normal locations when the job completes. This means that cylc cat-log or the gcylc GUI will be unable to find the job’s stdout and stderr streams while the job is running. Some sites with these batch systems are known to provide commands for viewing and/or tail-following a job’s stdout and stderr streams that are redirected to these cache areas. If this is the case at your site, you can configure cylc to make use of the provided commands by adding some settings to the global site/user config. E.g.:
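(A sketch, assuming the out/err viewer and tailer items; qcat stands for a hypothetical site-provided command.)

    [hosts]
        [[myhpc*]]
            [[[batch systems]]]
                [[[[pbs]]]]
                    err tailer = qcat -f -e %(job_id)s
                    out tailer = qcat -f -o %(job_id)s
                    err viewer = qcat -e %(job_id)s
                    out viewer = qcat -o %(job_id)s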
To change the form of the actual command used to submit a job you do not need to define a new batch system handler; just override the command template in the relevant job submission sections of your suite.rc file:
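(A sketch for the loadleveler method.)

    [runtime]
        [[root]]
            [[[job]]]
                batch system = loadleveler
                batch submit command template = llsubmit %(job)s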
As explained in Appendix A, the template’s %(job)s will be substituted by the job file path.
For supported batch systems, one-way polling can be used to determine actual job status: the suite server program executes a process on the task host, by non-interactive ssh, to interrogate the batch queueing system there, and to read a status file that is automatically generated by the task job script as it runs.
Polling may be required to update the suite state correctly after unusual events such as a machine being rebooted with tasks running on it, or network problems that prevent task messages from getting back to the suite host.
Tasks can be polled on demand by right-clicking on them in gcylc or using the cylc poll command.
Tasks are polled automatically, once, if they timeout while queueing in a batch scheduler and submission timeout is set. (See A.5.1.13 for how to configure timeouts).
Tasks are polled multiple times, where necessary, when they exceed their execution time limits. These are normally set with some initial delays to allow the batch systems to kill the jobs. (See B.9.1.18.6 for how to configure the polling intervals).
Any tasks recorded in the submitted or running states at suite restart are automatically polled to determine what happened to them while the suite was down.
Regular polling can also be configured as a health check on tasks submitted to hosts that are known to be flaky, or as the sole method of determining task status on hosts that do not allow task messages to be routed back to the suite host.
To use polling instead of task-to-suite messaging set task communication method = poll in cylc site and user global config (see B.9.1.3). The default polling intervals can be overridden for all suites there too (see B.9.1.5 and B.9.1.4), or in specific suite definitions (in which case polling will be done regardless of the task communication method configured for the host; see A.5.1.11.7 and A.5.1.11.8).
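For example, in the site/user global config:

    [hosts]
        [[HOST]]
            task communication method = poll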
Note that regular polling is not as efficient as task messaging in updating task status, and it should be used sparingly in large suites.
Note that for polling to work correctly, the batch queueing system must have a job listing command for listing your jobs, and that the job listing must display job IDs as they are returned by the batch queueing system submit command. For example, for pbs, moab and sge, the qstat command should list jobs with their IDs displayed in exactly the same format as they are returned by the qsub command.
For supported batch systems, the suite server program can execute a process on the task host, by non-interactive ssh, to kill a submitted or running job according to its batch system.
Tasks can be killed on demand by right-clicking on them in gcylc or using the cylc kill command.
You can specify an execution time limit for all supported job submission methods. E.g.:
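A minimal sketch (the one-hour limit is illustrative):

    [runtime]
        [[my_task]]
            [[[job]]]
                execution time limit = PT1H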
For tasks running with background or at, their jobs will be wrapped using the timeout command. For all other methods, the relevant time limit directive will be added to their job files.
The execution time limit setting also tells the suite when a task job should have completed by. If a task job has not reported completing within the specified time, the suite will poll the task job. (The default setting is PT1M, PT2M, PT7M. The accumulated times for these intervals will be roughly 1 minute, 1 + 2 = 3 minutes and 1 + 2 + 7 = 10 minutes after a task job exceeds its execution time limit.)
If you specify an execution time limit the execution timeout event handler will only be called if the job has not completed after the final poll (by default, 10 min after the time limit). This should only happen if the submission method you are using is not enforcing wallclock limits (unlikely) or you are unable to contact the machine to confirm the job status.
If you specify an execution timeout and not an execution time limit then the execution timeout event handler will be called as soon as the specified time is reached. The job will also be polled to check its latest status (possibly resulting in an update in its status and the calling of the relevant event handler). This behaviour is deprecated and should be avoided.
If you specify an execution timeout and an execution time limit then the execution timeout setting will be ignored.
Defining a new batch system handler requires a little Python programming. Use the built-in handlers as examples, and read the documentation in lib/cylc/batch_sys_manager.py.
The following qsub.py module overrides the built-in pbs batch system handler to change the directive prefix from #PBS to #QSUB:
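A sketch of such a module (the handler base class location shown here should be checked against your cylc version):

    #!/usr/bin/env python
    """qsub.py: override the built-in pbs handler's directive prefix."""

    from cylc.batch_sys_handlers.pbs import PBSHandler


    class QSUBHandler(PBSHandler):
        # Write "#QSUB" directives instead of "#PBS" ones.
        DIRECTIVE_PREFIX = "#QSUB "


    BATCH_SYS_HANDLER = QSUBHandler()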
If this is in the Python search path (see 11.7.2 below) you can use it by name in suite definitions:
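    [runtime]
        [[root]]
            [[[job]]]
                batch system = qsub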
Generate a job script to see the resulting directives:
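(Task ID illustrative.)

    cylc jobscript SUITE my_task.20170101T0000Z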
(Of course this suite will fail at run time because we only changed the directive format, and PBS does not accept #QSUB directives in reality).
Custom batch system handlers must be installed on suite and job hosts somewhere in the Python search path used by cylc, e.g. under lib/python/ in the suite definition directory.
(A note for Rose users: rose suite-run automatically installs SUITE-DEF-PATH/lib/python/ to job hosts).
This chapter currently features a diverse collection of topics related to running suites. Please also see the Tutorial (7) and command documentation (F), and experiment with plenty of examples.
There are three ways to start a suite running: cold start and warm start, which start from scratch; and restart, which starts from a prior suite state checkpoint. The only difference between cold starts and warm starts is that warm starts start from a point beyond the suite initial cycle point.
Once a suite is up and running it is typically a restart that is needed most often (but see also cylc reload). Be aware that cold and warm starts wipe out prior suite state, so you can’t go back to a restart if you decide you made a mistake.
A cold start is the primary way to start a suite run from scratch:
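    cylc run SUITE [INITIAL-CYCLE-POINT]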
The initial cycle point may be specified on the command line or in the suite.rc file. The scheduler starts by loading the first instance of each task at the suite initial cycle point, or at the next valid point for the task.
A warm start runs a suite from scratch like a cold start, but from the beginning of a given cycle point that is beyond the suite initial cycle point. This is generally inferior to a restart (which loads a previously recorded suite state - see 12.1.3) because it may result in some tasks rerunning. However, a warm start may be required if a restart is not possible, e.g. because the suite run database was accidentally deleted. The warm start cycle point must be given on the command line:
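    cylc run --warm SUITE START-CYCLE-POINT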
The original suite initial cycle point is preserved, but all tasks and dependencies before the given warm start cycle point are ignored.
The scheduler starts by loading a first instance of each task at the warm start cycle point, or at the next valid point for the task. R1-type tasks behave exactly the same as other tasks - if their cycle point is at or later than the given start cycle point, they will run; if not, they will be ignored.
At restart (see cylc restart --help) a suite server program initializes its task pool from a previously recorded checkpoint state. By default the latest automatic checkpoint - which is updated with every task state change - is loaded so that the suite can carry on exactly as it was just before being shut down or killed.
Tasks recorded in the ‘submitted’ or ‘running’ states are automatically polled (see Section 12.5) at start-up to determine what happened to them while the suite was down.
Restart From Latest Checkpoint To restart from the latest checkpoint simply invoke the cylc restart command with the suite name (or select ‘restart’ in the GUI suite start dialog window):
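    cylc restart SUITE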
Restart From Another Checkpoint Suite server programs automatically update the “latest” checkpoint every time a task changes state, and at every suite restart, but you can also take checkpoints at other times. To tell a suite server program to checkpoint its current state:
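    cylc checkpoint SUITE-NAME CHECKPOINT-NAME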
The second argument is a name to identify the checkpoint later (e.g. ‘bob’, ‘alice’, or ‘breakfast’) with the cylc ls-checkpoints command, which lists the stored checkpoint IDs, names, and times:
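    cylc ls-checkpoints SUITE-NAME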
To see the actual task state content of a given checkpoint ID (if you need to), for the moment you have to interrogate the suite DB, e.g.:
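(A sketch, assuming the checkpointed task pool is stored in a task_pool_checkpoints table keyed by checkpoint ID.)

    sqlite3 ~/cylc-run/SUITE/log/db \
        'select * from task_pool_checkpoints where id == 3;'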
Note that a checkpoint captures the instantaneous state of every task in the suite, including any tasks that are currently active, so you may want to be careful where you do it. Tasks recorded as active are polled automatically on restart to determine what happened to them.
The checkpoint ID 0 (zero) is always used for latest state of the suite, which is updated continuously as the suite progresses. The checkpoint IDs of earlier states are positive integers starting from 1, incremented each time a new checkpoint is stored. Currently suites automatically store checkpoints before and after reloads, and on restarts (using the latest checkpoints before the restarts).
Once you have identified the right checkpoint, restart the suite like this:
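    cylc restart --checkpoint=CHECKPOINT-ID SUITE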
or enter the checkpoint ID in the space provided in the GUI restart window.
Checkpointing With A Task Checkpoints can be generated automatically at particular points in the workflow by coding tasks that run the cylc checkpoint command:
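A sketch, assuming the job script exports the process ID of the backgrounded “task started” message as $CYLC_TASK_MESSAGE_STARTED_PID (see the note below):

    [scheduling]
        [[dependencies]]
            graph = "pre => checkpointer => post"
    [runtime]
        [[checkpointer]]
            script = """
                # Wait for the "task started" message to get through
                # before taking the checkpoint (see the note below).
                wait "${CYLC_TASK_MESSAGE_STARTED_PID}" 2>/dev/null || true
                cylc checkpoint "${CYLC_SUITE_NAME}" "CP-${CYLC_TASK_CYCLE_POINT}"
            """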
Note that we need to wait on the “task started” message - which is sent in the background to avoid holding tasks up in a network outage - to ensure that the checkpointer task is correctly recorded as running in the checkpoint (at restart the suite server program will poll to confirm that the task job finished successfully). Otherwise it may be recorded in the waiting state and, if its upstream dependencies have already been cleaned up, it will need to be manually reset from waiting to succeeded after the restart to avoid stalling the suite.
Behaviour of Tasks on Restart All tasks are reloaded in exactly their checkpointed states. Failed tasks are not automatically resubmitted at restart in case the underlying problem has not been addressed yet.
Tasks recorded in the submitted or running states are automatically polled on restart, to see if they are still waiting in a batch queue, still running, or if they succeeded or failed while the suite was down. The suite state will be updated automatically according to the poll results.
Existing instances of tasks removed from the suite definition before restart are not removed from the task pool automatically, but they will not spawn new instances. They can be removed manually if necessary, with cylc remove.
Similarly, instances of new tasks added to the suite definition before restart are not inserted into the task pool automatically, because it is very difficult in general to automatically determine the cycle point of the first instance. Instead, the first instance of a new task should be inserted manually at the right cycle point, with cylc insert.
The cylc reload command tells a suite server program to reload its suite definition at run time. This is an alternative to shutting a suite down and restarting it after making changes.
As for a restart, existing instances of tasks removed from the suite definition before reload are not removed from the task pool automatically, but they will not spawn new instances. They can be removed manually if necessary, with cylc remove.
Similarly, instances of new tasks added to the suite definition before reload are not inserted into the pool automatically. The first instance of each must be inserted manually at the right cycle point, with cylc insert.
Task jobs need access to Cylc on the job host, primarily for task messaging, but also to allow user-defined task scripting to run other Cylc commands.
Cylc should be installed on job hosts as on suite hosts, with different releases installed side-by-side and invoked via the central Cylc wrapper according to the value of $CYLC_VERSION - see Section 3.3. Task job scripts set $CYLC_VERSION to the version of the parent suite server program, so that the right Cylc will be invoked by jobs on the job host.
Access to the Cylc executable (preferably the central wrapper as just described) for different job hosts can be configured using site and user global configuration files (on the suite host). If the environment for running the Cylc executable is only set up correctly in a login shell for a given host, you can set [hosts][HOST]use login shell = True for the relevant host (this is the default, to cover more sites automatically). If the environment is already correct without the login shell, but the Cylc executable is not in $PATH, then [hosts][HOST]cylc executable can be used to specify the direct path to the executable.
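For example, in the global config (the host name and path are illustrative):

    [hosts]
        [[myhpc]]
            use login shell = False
            cylc executable = /opt/cylc/bin/cylc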
To customize the environment more generally for Cylc on job hosts, use of job-init-env.sh is described in Section 7.1.1.
At start-up, suite server programs write a suite contact file $HOME/cylc-run/SUITE/.service/contact that records suite host, user, port number, process ID, Cylc version, and other information. Client commands can read this file, if they have access to it, to find the target suite server program.
At any point after job submission task jobs can be polled to check that their true state conforms to what is currently recorded by the suite server program. See cylc poll --help for how to poll one or more tasks manually, or right-click poll a task or family in GUI.
Polling may be necessary if, for example, a task job gets killed by the untrappable SIGKILL signal (e.g. kill -9 PID), or if a network outage prevents task success or failure messages getting through, or if the suite server program itself is down when tasks finish execution.
To poll a task job the suite server program interrogates the batch system, and the job.status file, on the job host. This information is enough to determine the final task status even if the job finished while the suite server program was down or unreachable on the network.
Task jobs are automatically polled at certain times: once on job submission timeout; several times on exceeding the job execution time limit; and at suite restart any tasks recorded as active in the suite state checkpoint are polled to find out what happened to them while the suite was down.
Finally, if necessary, routine polling can be configured as a way to track job status on job hosts that do not allow network routing back to the suite host for task messaging by HTTPS or ssh. See 12.6.3.
Cylc supports three ways of tracking task state on job hosts: task messaging over HTTPS (the default), task messaging re-invoked on the suite host via non-interactive ssh, and regular polling by the suite server program.
These are explained in the following sections. All three can be used, on different job hosts, in the same suite if necessary.
If your site prohibits HTTPS and ssh back from job hosts to suite hosts, before resorting to the polling method you should consider installing dedicated Cylc servers or VMs inside the HPC trust zone (where HTTPS and ssh should be allowed).
It is also possible to run Cylc suite server programs on HPC login nodes, but this is not recommended for load, run duration, and GUI reasons.
Finally, it has been suggested that port forwarding may provide another solution - but that is beyond the scope of this document.
Task job wrappers automatically invoke cylc message to report progress back to the suite server program when they begin executing, at normal exit (success) and abnormal exit (failure).
By default the messaging occurs via an authenticated, HTTPS connection to the suite server program. This is the preferred task communications method - it is efficient and direct.
Suite server programs automatically install suite contact information and credentials on job hosts. Users only need to do this manually for remote access to suites on other hosts, or suites owned by other users - see 12.11.
Cylc can be configured to re-invoke task messaging commands on the suite host via non-interactive ssh (from job host to suite host). Then a local HTTPS connection is made to the suite server program.
(User-invoked client commands (aside from the GUI, which requires HTTPS) can do the same thing with the --use-ssh command option).
This is less efficient than direct HTTPS messaging, but it may be useful at sites where the HTTPS ports are blocked but non-interactive ssh is allowed.
Finally, suite server programs can actively poll task jobs at configurable intervals, via non-interactive ssh to the job host.
Polling is the least efficient task communications method because task state is updated only at intervals, not when task events actually occur. However, it may be needed at sites that do not allow HTTPS or non-interactive ssh from job host to suite host.
Be careful to avoid spamming task hosts with polling commands. Each poll opens (and then closes) a new ssh connection.
Polling intervals are configurable under [runtime] because they may depend on the expected execution time. For instance, a task that typically takes an hour to run might be polled every 10 minutes initially, and then every minute toward the end of its run. Interval values are used in turn until the last value, which is used repeatedly until finished:
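For the hour-long task described above, a sketch:

    [runtime]
        [[my_task]]
            [[[job]]]
                execution polling intervals = 5*PT10M, PT1M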
A list of intervals with optional multipliers can be used for both submission and execution polling, although a single value is probably sufficient for submission polling. If these items are not configured default values from site and user global config will be used for the polling task communication method; polling is not done by default under the other task communications methods (but it can still be used if you like).
The default site and user global config items relevant to task state tracking can be inspected with cylc get-site-config.
At registration time a suite service directory, $HOME/cylc-run/<SUITE>/.service/, is created and populated with a private passphrase file (containing random text), a self-signed SSL certificate (see 12.9), and a symlink to the suite source directory. An existing passphrase file will not be overwritten if a suite is re-registered.
At run time, the private suite run database is also written to the service directory, along with a suite contact file that records the host, user, port number, process ID, Cylc version, and other information about the suite server program. Client commands automatically read daemon targeting information from the contact file, if they have access to it.
Some Cylc commands and GUI actions parse suite definitions or read other files from the suite host account, rather than communicate with a suite server program over the network. In future we plan to have the suite server program serve up these files to clients, but for the moment this functionality requires read-access to the relevant files on the suite host.
If you are logged into the suite host account, file-reading commands will just work.
If you are logged into another host with shared home directories (shared filesystems are common in HPC environments) file-reading commands will just work because suite files will look “local” on both hosts.
If you are logged into another host with no shared home directory, file-reading commands require non-interactive ssh to the suite host account, and use of the --host and --user options to re-invoke the command on the suite account.
(This is essentially the same as Remote Host, Different Home Directory.)
Cylc server programs listen on dedicated network ports for HTTPS communications from Cylc clients (task jobs, and user-invoked commands and GUIs).
Use cylc scan to see which suites are listening on which ports on scanned hosts (this lists your own suites by default, but it can show others too - see cylc scan --help).
Cylc supports two kinds of access to suite server programs:
Without a suite passphrase the amount of information revealed by a suite server program is determined by the public access privilege level set in global site/user config (B.15) and optionally overridden in suites (A.3.16):
The default public access level is state-totals.
The cylc scan command and the cylc gscan GUI can print descriptions and task state totals in addition to basic suite identity, if that information is revealed publicly.
Suite auth files (passphrase and SSL certificate) give full control. They are loaded from the suite service directory by the suite server program at start-up, and used to authenticate subsequent client connections. Passphrases are used in a secure encrypted challenge-response scheme, never sent in plain text over the network.
If two users need access to the same suite server program, they must both possess the passphrase file for that suite. Fine-grained access to a single suite server program via distinct user accounts is not currently supported.
Suite server programs automatically install their auth and contact files to job hosts via ssh, to enable task jobs to connect back to the suite server program for task messaging.
Client programs invoked by the suite owner automatically load the passphrase, SSL certificate, and contact file too, for automatic connection to suites.
Manual installation of suite auth files is only needed for remote control, if you do not have a shared filesystem - see below.
The gcylc GUI is mainly a network client to retrieve and display suite status information from the suite server program, but it can also invoke file-reading commands to view and graph the suite definition and so on. This is entirely transparent if the GUI is running on the suite host account, but full functionality for remote suites requires either a shared filesystem, or (see 12.11) auth file installation and non-interactive ssh access to the suite host. Without the auth files you will not be able to connect to the suite, and without ssh you will see “permission denied” errors on attempting file access.
Cylc client programs - command line and GUI - can interact with suite server programs running on other accounts or hosts. How this works depends on whether or not you have:
With a shared filesystem, a suite registered on the remote (server) host is also - in effect - registered on the local (client) host. In this case you can invoke client commands without the --host option; the client will automatically read the host and port from the contact file in the suite service directory.
To control suite server programs running under other user accounts or on other hosts without a shared filesystem, the suite SSL certificate and passphrase must be installed under your $HOME/.cylc/ directory:
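(A sketch; exact file names may vary by cylc version.)

    $HOME/.cylc/auth/OWNER@HOST/SUITE/passphrase
    $HOME/.cylc/auth/OWNER@HOST/SUITE/ssl.cert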
where OWNER@HOST is the suite host account and SUITE is the suite name. Client commands should then be invoked with the --user and --host options, e.g.:
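(Account and host names are illustrative.)

    cylc gui --user=owner --host=host.example.com SUITE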
Note remote suite auth files do not need to be installed for read-only access - see 12.9.1 - via the GUI or monitor.
The suite contact file (see 12.4) is not needed if you have read-access to the remote suite run directory via the local filesystem or non-interactive ssh to the suite host account - client commands will automatically read it. If you do install the contact file in your auth directory note that the port number will need to be updated if the suite gets restarted on a different port. Otherwise use cylc scan to determine the suite port number and use the --port client command option.
WARNING: possession of a suite passphrase gives full control over the target suite, including edit run functionality - which lets you run arbitrary scripting on job hosts as the suite owner. Further, non-interactive ssh gives full access to the target user account, so we recommend that this is only used to interact with suites running under accounts to which you already have full access.
Both cylc scan and the cylc gscan GUI can display suites owned by other users on other hosts, including task state totals if the public access level permits that (see 12.9.1). Clicking on a remote suite in gscan will open a cylc gui to connect to that suite. This will give you full control, if you have the suite auth files installed; or it will display full read only information if the public access level allows that.
As a suite runs, its task proxies may pass through the following states:
The GUI Text-tree and Dot Views display the state of every task proxy present in the task pool. Once a task has succeeded and Cylc has determined that it can no longer be needed to satisfy the prerequisites of other tasks, its proxy will be cleaned up (removed from the pool) and it will disappear from the GUI. To rerun a task that has disappeared from the pool, you need to re-insert its task proxy and then re-trigger it.
The Graph View is slightly different: it displays the complete dependency graph over the range of cycle points currently present in the task pool. This often includes some greyed-out base or ghost nodes that are empty - i.e. there are no corresponding task proxies currently present in the pool. Base nodes just flesh out the graph structure. Groups of them may be cut out and replaced by single scissor nodes in sections of the graph that are currently inactive.
A connection timeout can be set in site and user global config files (see 6) so that messaging commands cannot hang indefinitely if the suite is not responding (this can be caused by suspending a suite with Ctrl-Z) thereby preventing the task from completing. The same can be done on the command line for other suite-connecting user commands, with the --comms-timeout option.
Runahead limiting prevents the fastest tasks in a suite from getting too far ahead of the slowest ones. Newly spawned tasks are released to the task pool only when they fall below the runahead limit. A low runahead limit can prevent cylc from interleaving cycles, but it will not stall a suite unless it fails to extend out past a future trigger (see 9.3.5.11). A high runahead limit may allow fast tasks that are not constrained by dependencies or clock-triggers to spawn far ahead of the pack, which could have performance implications for the suite server program when running very large suites. Succeeded and failed tasks are ignored when computing the runahead limit.
The preferred runahead limiting mechanism restricts the number of consecutive active cycle points. The default value is three active cycle points; see A.4.8. Alternatively the interval between the slowest and fastest tasks can be specified as hard limit; see A.4.7.
Large suites can potentially overwhelm task hosts by submitting too many tasks at once. You can prevent this with internal queues, which limit the number of tasks that can be active (submitted or running) at the same time.
Internal queues behave in a first-in-first-out (FIFO) manner, i.e. tasks are released from a queue in the same order in which they were queued.
A queue is defined by a name; a limit, which is the maximum number of active tasks allowed for the queue; and a list of members, assigned by task or family name.
Queue configuration is done under the [scheduling] section of the suite.rc file (like dependencies, internal queues constrain when a task runs).
By default every task is assigned to the default queue, which by default has a zero limit (interpreted by cylc as no limit). To use a single queue for the whole suite just set the default queue limit:
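(A sketch; the limit value is illustrative.)

    [scheduling]
        [[queues]]
            [[[default]]]
                limit = 6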
To use additional queues just name each one, set their limits, and assign members:
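(A sketch; queue names, limits, and members are illustrative.)

    [scheduling]
        [[queues]]
            [[[q_foo]]]
                limit = 2
                members = foo, bar, baz
            [[[q_big]]]
                limit = 3
                members = BIG_FAMILY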
Any tasks not assigned to a particular queue will remain in the default queue. The queues example suite illustrates how queues work by running two task trees side by side (as seen in the graph GUI) each limited to 2 and 3 tasks respectively:
See also A.5.1.11.6 in the Suite.rc Reference.
Tasks can be configured with a list of “retry delay” intervals, as ISO 8601 durations. If the task job fails it will go into the retrying state and resubmit after the next configured delay interval. An example is shown in the suite listed below under 12.19.
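A minimal sketch of the setting itself:

    [runtime]
        [[my_task]]
            [[[job]]]
                execution retry delays = 3*PT30M, PT1H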
If a task with configured retries is killed (by cylc kill or via the GUI) it goes to the held state so that the operator can decide whether to release it and continue the retry sequence or to abort the retry sequence by manually resetting it to the failed state.
See also A.3.13 and A.5.1.13 in the Suite.rc Reference.
Cylc can call nominated event handlers - to do whatever you like - when certain suite or task events occur. This facilitates centralized alerting and automated handling of critical events. Event handlers can be used to send a message, call a pager, or whatever; they can even intervene in the operation of their own suite using cylc commands.
To send an email, use the built-in setting [[[events]]]mail events to specify a list of events for which notifications should be sent. E.g. to send an email on (submission) failed and retry:
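(A sketch.)

    [runtime]
        [[my_task]]
            [[[events]]]
                mail events = submission failed, submission retry, failed, retry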
By default, notification emails are sent to the current user. The sender address, recipient list, and SMTP server can be configured using the mail from, mail to, and mail smtp settings:
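(Addresses and server are illustrative.)

    [runtime]
        [[my_task]]
            [[[events]]]
                mail from = notifications@your-site
                mail to = you@your-site colleague@your-site
                mail smtp = smtp.your-site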
By default, a cylc suite will send you no more than one task event email every 5 minutes - this is to prevent your inbox from being flooded by emails should a large group of tasks all fail at a similar time. See A.3.8 for details.
Event handlers can be located in the suite bin/ directory; otherwise it is up to you to ensure their location is in $PATH (in the shell in which the suite server program runs). They should require little resource and return quickly - see 12.20.
Task event handlers can be specified using the [[[events]]]<event> handler settings, where <event> is one of:
The value of each setting should be a list of command lines or command line templates (see below).
Alternatively you can use [[[events]]]handlers and [[[events]]]handler events, where the former is a list of command lines or command line templates (see below) and the latter is a list of events for which these commands should be invoked.
Event handler arguments can be constructed from various templates representing suite name; task ID, name, cycle point, message, and submit number; and any suite or task [meta] item. See A.3.13 and A.5.1.13 for options.
If no template arguments are supplied the following default command line will be used:
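    <event-handler> %(event)s %(suite)s %(id)s %(message)s

i.e. the event name, the suite name, the task ID, and the event message, in that order (a sketch based on the template values listed above).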
Note: substitution patterns should not be quoted in the template strings. This is done automatically where required.
For an explanation of the substitution syntax, see String Formatting Operations in the Python documentation.
The retry event occurs if a task fails and has any remaining retries configured (see 12.18). The event handler will be called as soon as the task fails, not after the retry delay period when it is resubmitted.
Note that event handlers are called by the suite server program, not by task jobs. If you wish to pass additional information to them use [cylc] →[[environment]], not task runtime environment.
The following two suite.rc snippets show the two alternative ways of specifying event handlers:
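(A sketch; my-handler.sh is a hypothetical script on $PATH or in the suite bin directory.)

    [runtime]
        [[my_task]]
            [[[events]]]
                failed handler = my-handler.sh

Or, equivalently:

    [runtime]
        [[my_task]]
            [[[events]]]
                handlers = my-handler.sh
                handler events = failed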
The handler command here - specified with no arguments - is called with the default arguments, like this:
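    my-handler.sh %(event)s %(suite)s %(id)s %(message)s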
You may want to be notified when certain tasks are running late in a real time production system - i.e. when they have not triggered by the usual time. Tasks of primary interest are not normally clock-triggered however, so their trigger times are mostly a function of how the suite runs in its environment, and even external factors such as contention with other suites.4
But if your system is reasonably stable from one cycle to the next such that a given task has consistently triggered by some interval beyond its cycle point, you can configure Cylc to emit a late event if it has not triggered by that time. For example, if a task forecast normally triggers by 30 minutes after its cycle point, configure late notification for it like this:
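(A sketch, assuming the late offset and late handler settings; the handler script is hypothetical.)

    [runtime]
        [[forecast]]
            [[[events]]]
                late offset = PT30M
                late handler = my-late-handler.sh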
Late offset intervals are not computed automatically so be careful to update them after any change that affects triggering times.
Note that Cylc can only check for lateness in tasks that it is currently aware of. If a suite gets delayed over many cycles the next tasks coming up can be identified as late immediately, and subsequent tasks can be identified as late as the suite progresses to subsequent cycle points, until it catches up to the clock.
Job submission commands, event handlers, and job poll and kill commands, are executed by the suite server program in a “pool” of asynchronous subprocesses, in order to avoid holding the suite up. The process pool is actively managed to limit it to a configurable size (B.1.2). Custom event handlers should be light-weight and quick-running because they will tie up a process pool member until they complete, and the suite will appear to stall if the pool is saturated with long-running processes. Processes are killed after a configurable timeout (B.1.3) however, to guard against rogue commands that hang indefinitely. All process kills are logged by the suite server program. For killed job submissions the associated tasks also go to the submit-failed state.
Some HPC facilities allow job preemption: the resource manager can kill or suspend running low priority jobs in order to make way for high priority jobs. The preempted jobs may then be automatically restarted by the resource manager, from the same point (if suspended) or requeued to run again from the start (if killed).
Suspended jobs will poll as still running (their job status file says they started running, and they still appear in the resource manager queue). Loadleveler jobs that are preempted by kill-and-requeue (“job vacation”) are automatically returned to the submitted state by Cylc. This is possible because Loadleveler sends the SIGUSR1 signal before SIGKILL for preemption. Other batch schedulers just send SIGTERM before SIGKILL as normal, so Cylc cannot distinguish a preemption job kill from a normal job kill. After this the job will poll as failed (correctly, because it was killed, and the job status file records that). To handle this kind of preemption automatically you could use a task failed or retry event handler that queries the batch scheduler queue (after an appropriate delay if necessary) and then, if the job has been requeued, uses cylc reset to reset the task to the submitted state.
Any task proxy currently present in the suite can be manually triggered at any time using the cylc trigger command, or from the right-click task menu in gcylc. If the task belongs to a limited internal queue (see 12.17), this will queue it; if not, or if it is already queued, it will submit immediately.
With cylc trigger --edit (also in the gcylc right-click task menu) you can edit the generated task job script to make one-off changes before the task submits.
The cylc broadcast command overrides [runtime] settings in a running suite. This can be used to communicate information to downstream tasks by broadcasting environment variables (communication of information from one task to another normally takes place via the filesystem, i.e. the input/output file relationships embodied in inter-task dependencies). Variables (and any other runtime settings) may be broadcast to all subsequent tasks, or targeted at a specific task, at all subsequent tasks with a given name, or at all tasks with a given cycle point; see the broadcast command help for details.
Broadcast settings targeted at a specific task ID or cycle point expire and are forgotten as the suite moves on. Un-targeted variables and those targeted at a task name persist throughout the suite run, even across restarts, unless manually cleared using the broadcast command - and so should be used sparingly.
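For example, a broadcast targeting one task at one cycle point might look like this (a sketch; the setting and target are illustrative):

    cylc broadcast --set='[environment]DATASET=/path/to/data' \
        -p 20170101T0000Z -n my_task SUITE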
When a suite is started with the cylc run command (cold or warm start) the cycle point at which it starts can be given on the command line or hardwired into the suite.rc file:
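(Suite name and cycle point here are illustrative.)

    cylc run my.suite 20170101T0000Z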
or:
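    [scheduling]
        initial cycle point = 20170101T0000Z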
An initial cycle point given on the command line will override one in the suite.rc file.
In the case of a cold start only the initial cycle point is passed through to task execution environments as $CYLC_SUITE_INITIAL_CYCLE_POINT. The value is then stored in suite database files and persists across restarts, but it does get wiped out (set to None) after a warm start, because a warm start is really an implicit restart in which all state information is lost (except that the previous cycle is assumed to have completed).
The $CYLC_SUITE_INITIAL_CYCLE_POINT variable allows tasks to determine if they are running in the initial cold-start cycle point, when different behaviour may be required, or in a normal mid-run cycle point. Note however that an initial R1 graph section is now the preferred way to get different behaviour at suite start-up.
Several suite run modes allow you to simulate suite behaviour quickly without running the suite’s real jobs - which may be long-running and resource-hungry:
Set the run mode (default live) in the GUI suite start dialog box, or on the command line:
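    cylc run --mode=simulation SUITE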
You can get specified tasks to fail in these modes, for more flexible suite testing. See Section A.5.1.20 for simulation configuration.
If task [job]execution time limit is set, Cylc divides it by [simulation]speedup factor (default 10.0) to compute simulated task run lengths (default 10 seconds).
Dummy mode ignores batch scheduler settings because Cylc does not know which job resource directives (requested memory, number of compute nodes, etc.) would need to be changed for the dummy jobs. If you need to dummy-run jobs on a batch scheduler, manually comment out script items and modify directives in your live suite, or else use a custom live-mode test suite.
Note that the dummy modes ignore all configured task script items including init-script. If your init-script is required to run even dummy tasks on a job host, note that host environment setup should be done elsewhere - see 3.3.3.
The run mode is recorded in the suite run database files. Cylc will not let you restart a non-live mode suite in live mode, or vice versa. To test a live suite in simulation mode just take a quick copy of it and run the copy in simulation mode.
Reference tests are finite-duration suite runs that abort with non-zero exit status if any of the following conditions occur (by default): cylc fails, any task fails, the suite times out, or the shutdown event handler that compares the test run with a reference run reports failure.
The default shutdown event handler for reference tests is cylc hook check-triggering which compares task triggering information (what triggers off what at run time) in the test run suite log to that from an earlier reference run, disregarding the timing and order of events - which can vary according to the external queueing conditions, runahead limit, and so on.
To prepare a reference log for a suite, run it with the --reference-log option, and manually verify the correctness of the reference run.
To reference test a suite, just run it (in dummy mode for the most comprehensive test without running real tasks) with the --reference-test option.
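For example (a sketch):

    cylc run --reference-log SUITE                 # record a reference run
    cylc run --mode=dummy --reference-test SUITE   # later: run a reference test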
A battery of automated reference tests is used to test cylc before posting a new release version. Reference tests can also be used to check that a cylc upgrade will not break your own complex suites - the triggering check will catch any bug that causes a task to run when it shouldn’t, for instance; even in a dummy mode reference test the full task job script (sans script items) executes on the proper task host by the proper batch system.
Reference tests can be configured with the settings in the [cylc] →[[reference test]] section of the suite.rc reference.
If the default reference test is not sufficient for your needs, firstly note that you can override the default shutdown event handler, and secondly that the --reference-test option is merely a short cut to the following suite.rc settings which can also be set manually if you wish:
The cylc suite-state command interrogates suite run databases. It has a polling mode that waits for a given task in the target suite to achieve a given state. This can be used to make task scripting wait for a remote task to succeed (for example). The suite graph notation also provides a way to define automatic suite-state polling tasks, which use the same polling command under the hood. Note that cylc suite-state can only trigger off task states in remote suites and does not support triggering off task messages.
Here’s how to trigger a task bar off a task foo in a remote suite called other.suite:
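A sketch of the graph notation (the cycling sequence is illustrative):

    [scheduling]
        [[dependencies]]
            [[[T00, T12]]]
                graph = "my-foo<other.suite::foo> => bar"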
Local task my-foo will poll for the success of foo in suite other.suite, at the same cycle point, succeeding only if and when the remote task succeeds. Other task states can also be polled:
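E.g. to poll for failure of the remote task instead (a sketch):

    graph = "my-foo<other.suite::foo:fail> => bar"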
The default polling parameters (e.g. maximum number of polls and the interval between them) are printed by cylc suite-state --help and can be configured if necessary under the local polling task runtime section:
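(A sketch; the poll count and interval are illustrative.)

    [runtime]
        [[my-foo]]
            [[[suite state polling]]]
                max-polls = 24
                interval = PT1M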
For suites owned by others, or those with run databases in non-standard locations, use the --run-dir option, or in-suite:
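(A sketch; the path is illustrative and should point at the other user’s top level run directory.)

    [runtime]
        [[my-foo]]
            [[[suite state polling]]]
                run-dir = /home/owner/cylc-run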
If the remote task has a different cycling sequence, just arrange for the local polling task to be on the same sequence as the remote task that it represents. For instance, if local task cat cycles 6-hourly at 0,6,12,18 but needs to trigger off a remote task dog at 3,9,15,21:
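(A sketch; my-dog is a hypothetical local polling task placed on the remote task’s sequence.)

    [scheduling]
        [[dependencies]]
            [[[T03, T09, T15, T21]]]
                graph = "my-dog<other.suite::dog>"
            [[[T00, T06, T12, T18]]]
                graph = "my-dog[-PT3H] => cat"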
For suite-state polling the cycle point of the target task is treated as a literal string so the polling command has to be told if the remote suite has a different cycle point format. Use the --template option for this, or in-suite:
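(A sketch; the template value is illustrative and must match the remote suite’s cycle point format.)

    [runtime]
        [[my-foo]]
            [[[suite state polling]]]
                template = %Y%m%dT%H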
The remote suite does not have to be running when polling commences because the command interrogates the suite run database, not the suite server program.
Note that the graph syntax for suite polling tasks cannot be combined with cycle point offsets, family triggers, or parameterized task notation. This does not present a problem because suite polling tasks can be put on the same cycling sequence as the remote-suite target task (as recommended above), and there is no point in having multiple tasks (family members or parameterized tasks) performing the same polling operation. Task state triggers can be used with suite polling, e.g. to trigger another task if polling fails after 10 tries at 10 second intervals:
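(A sketch; the notify task is hypothetical.)

    [scheduling]
        [[dependencies]]
            graph = "my-foo<other.suite::foo>:fail => notify"
    [runtime]
        [[my-foo]]
            [[[suite state polling]]]
                max-polls = 10
                interval = PT10S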
Each suite maintains its own log of time-stamped events under the suite server log directory:
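In a standard cylc-run directory layout this is (a sketch):

    $HOME/cylc-run/SUITE/log/suite/log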
By way of example, we will show the complete server log generated (at cylc-7.2.0) by a small suite that runs two 30-second dummy tasks foo and bar for a single cycle point 2017-01-01T00Z before shutting down:
By the task scripting defined above, this suite will stall when foo fails. Then, the suite owner vagrant@cylon manually resets the failed task’s state to succeeded, allowing bar to trigger and the suite to finish and shut down. Here’s the complete suite log for this run:
The information logged here includes:
Note that suite log files are primarily intended for human eyes. If you need to have an external system to monitor suite events automatically, interrogate the sqlite suite run database (see 12.29) rather than parse the log files.
Suite server programs maintain two sqlite databases to record restart checkpoints and various other aspects of run history:
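In a standard installation (a sketch):

    $HOME/cylc-run/SUITE/.service/db    # private suite run DB
    $HOME/cylc-run/SUITE/log/db         # public suite run DB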
The private DB is for use only by the suite server program. The identical public DB is provided for use by external commands such as cylc suite-state, cylc ls-checkpoints, and cylc report-timings. If the public DB gets locked for too long by an external reader, the suite server program will eventually delete it and replace it with a new copy of the private DB, to ensure that both correctly reflect the suite state.
You can interrogate the public DB with the sqlite3 command line tool, the sqlite3 module in the Python standard library, or any other sqlite interface.
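For example (a sketch; the task_states table and its columns are assumed here):

    sqlite3 ~/cylc-run/SUITE/log/db \
        'select name, cycle, status from task_states;'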
If a suite run directory gets deleted or corrupted, the options for recovery are:
A warm start (see 12.1.2) does not need a suite state checkpoint, but it wipes out prior run history, and it could re-run a significant number of tasks that had already completed.
To restart the suite, the critical Cylc files that must be restored are:
Note this discussion does not address restoration of files generated and consumed by task jobs at run time. How suite data is stored and recovered in your environment is a matter of suite and system design.
In short, you can simply restore the suite service directory, the log directory, and the suite.rc file that is the target of the symlink in the service directory. The service and log directories will come with extra files that aren’t strictly needed for a restart, but that doesn’t matter - although depending on your log housekeeping the log/job directory could be huge, so you might want to be selective about that. (Also in a Rose suite, the suite.rc file does not need to be restored if you restart with rose suite-run - which re-installs suite source files to the run directory).
The public DB is not strictly required for a restart - the suite server program will recreate it if need be - but it is required by cylc ls-checkpoints if you need to identify the right restart checkpoint.
The job status files are only needed if the restart suite state checkpoint contains active tasks that need to be polled to determine what happened to them while the suite was down. Without them, polling will fail and those tasks will need to be manually set to the correct state.
WARNING: it is not safe to copy or rsync a potentially-active sqlite DB - the copy might end up corrupted. It is best to stop the suite before copying a DB, or else write a back-up utility using the official sqlite backup API: http://www.sqlite.org/backup.html.
Small groups of cylc users can of course share suites by manual copying, and generic revision control tools can be used on cylc suites as for any collection of files. Beyond this cylc does not have a built-in solution for suite storage and discovery, revision control, and deployment, on a network. That is not cylc’s core purpose, and large sites may have preferred revision control systems and suite meta-data requirements that are difficult to anticipate. We can, however, recommend the use of Rose to do all of this very easily and elegantly with cylc suites.
Rose is a framework for managing and running suites of scientific applications, developed at the UK Met Office for use with cylc. It is available under the open source GPL license.
This appendix defines all legal suite definition config items. Embedded Jinja2 code (see 9.7) must process to a valid raw suite.rc file. See also 9.2 for a descriptive overview of suite.rc files, including syntax (9.2.1).
The only top level configuration items at present are the suite title and description.
Section containing metadata items for this suite. Several items (title, description, URL) are pre-defined and are used by the GUI. Others can be user-defined and passed to suite event handlers to be interpreted according to your needs. For example, the value of a “suite-priority” item could determine how an event handler responds to failure events.
A single line description of the suite. It is displayed in the GUI “Open Another Suite” window and can be retrieved at run time with the cylc show command.
A multi-line description of the suite. It can be retrieved at run time with the cylc show command.
A web URL to suite documentation. If present it can be browsed with the cylc doc command, or from the gcylc Suite menu. The string template %(suite_name)s will be replaced with the actual suite name. See also task URLs (A.5.1.10.3).
A group name for a suite. In the gscan GUI, suites with the same group name can be collapsed into a single state summary when the “group” column is displayed.
Replace __MANY__ with any user-defined metadata item. These, like title, URL, etc. can be passed to suite event handlers to be interpreted according to your needs. For example, “suite-priority”.
This section is for configuration that is not specifically task-related.
If this item is set cylc will abort if the suite is not started in the specified mode. This can be used for demo suites that have to be run in simulation mode, for example, because they have been taken out of their normal operational context; or to prevent accidental submission of expensive real tasks during suite development.
Cylc runs off the suite host’s system clock by default. This item allows you to run the suite in UTC even if the system clock is set to local time. Clock-trigger tasks will trigger when the current UTC time is equal to their cycle point date-time plus offset; other time values used, reported, or logged by the suite server program will usually also be in UTC. The default for this can be set at the site level (see B.14.1).
To just alter the timezone used in the date-time cycle point format, see A.3.5. To just alter the number of expanded year digits (for years below 0 or above 9999), see A.3.4.
Cylc usually uses a CCYYMMDDThhmmZ (Z in the special case of UTC) or CCYYMMDDThhmm+hhmm format (+ standing for either + or - here) for writing down date-time cycle points, which follows one of the basic formats outlined in the ISO 8601 standard. For example, a cycle point on the 3rd of February 2001 at 4:50 in the morning, UTC (+0000 timezone), would be written 20010203T0450Z. Similarly, for the 3rd of February 2001 at 4:50 in the morning, +1300 timezone, cylc would write 20010203T0450+1300.
You may use the isodatetime library’s syntax to write dates and times in ISO 8601 formats - CC for century, YY for decade and decadal year, +X for expanded year digits and their positive or negative sign, thereafter following the ISO 8601 standard example notation except for fractional digits, which are represented as ,ii for hh, ,nn for mm, etc. For example, to write date-times as week dates with fractional hours, set cycle point format to CCYYWwwDThh,iiZ e.g. 1987W041T08,5Z for 08:30 UTC on Monday on the fourth ISO week of 1987.
You can also use a subset of the strptime/strftime POSIX standard - supported tokens are %F, %H, %M, %S, %Y, %d, %j, %m, %s, %z.
The ISO8601 extended date-time format can be used (%Y-%m-%dT%H:%M) but note that the ‘-’ and ‘:’ characters end up in job log directory paths.
The pre cylc-6 legacy 10-digit date-time format YYYYMMDDHH is not ISO8601 compliant and can no longer be used as the cycle point format. For job scripts that still require the old format, use the cylc cyclepoint utility to translate the ISO8601 cycle point inside job scripts, e.g.:
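A sketch for task scripting (cylc cyclepoint reads $CYLC_TASK_CYCLE_POINT by default; the template value is illustrative):

    # e.g. 20170101T0600Z -> 2017010106
    CYCLE_TIME=$(cylc cyclepoint --template=%Y%m%d%H)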
For years below 0 or above 9999, the ISO 8601 standard specifies that an extra number of year digits and a sign should be used. This extra number needs to be written down somewhere (here).
For example, if this extra number is set to 2, 00Z on the 1st of January in the year 10040 will be represented as +0100400101T0000Z (2 extra year digits used). With this number set to 3, 06Z on the 4th of May 1985 would be written as +00019850504T0600Z.
This number defaults to 0 (no sign or extra digits used).
If you set UTC mode to True (A.3.2) then this will default to Z. If you use a custom cycle point format (A.3.3), you should specify the timezone choice (or null timezone choice) here as well.
You may set your own time zone choice here, which will be used for all date-time cycle point dumping. Time zones should be expressed as ISO 8601 time zone offsets from UTC, such as +13, +1300, -0500 or +0645, with Z representing the special +0000 case. Cycle points will be converted to the time zone you give and will be represented with this string at the end.
Cycle points that are input without time zones (e.g. as an initial cycle point setting) will use this time zone if set. If this isn’t set (and UTC mode is also not set), then they will default to the current local time zone.
Note that the ISO standard also allows writing the hour and minute separated by a ":" (e.g. +13:00) - however, this is not recommended, given that the time zone is used as part of task output filenames.
Cylc does not normally abort if tasks fail, but if this item is turned on it will abort with exit status 1 if any task fails.
Specify the time interval on which a running cylc suite will check that its run directory exists and that its contact file contains the expected information. If not, the suite will shut itself down automatically.
Group together all the task event mail notifications into a single email within a given interval. This is useful to prevent flooding users’ mail boxes when many task events occur within a short period of time.
This has the same effect as the --no-auto-shutdown flag for the suite run commands: it prevents the suite server program from shutting down normally when all tasks have finished (a suite timeout can still be used to stop the daemon after a period of inactivity, however). This option can make it easier to re-trigger tasks manually near the end of a suite run, during suite development and debugging.
If this is turned on cylc will write the resolved dependencies of each task to the suite log as it becomes ready to run (a list of the IDs of the tasks that actually satisfied its prerequisites at run time). Mainly used for cylc testing and development.
Define parameter values here for use in expanding parameterized tasks - see Section 9.6.
Parameterized task names (see previous item, and Section 9.6) are expanded, for each parameter value, using string templates. You can assign templates to parameter names here, to override the default templates.
Note that the values of a parameter named p are substituted for %(p)s. In _run%(run)s the first “run” is a string literal, and the second gets substituted with each value of the parameter.
Cylc has internal “hooks” to which you can attach handlers that are called by the suite server program whenever certain events occur. This section configures suite event hooks; see A.5.1.13 for task event hooks.
Event handler commands can send an email or an SMS, call a pager, intervene in the operation of their own suite, or whatever. They can be held in the suite bin directory, otherwise it is up to you to ensure their location is in $PATH (in the shell in which cylc runs, on the suite host). The commands should require very little resource to run and should return quickly.
Each event handler can be specified as a list of command lines or command line templates.
A command line template may have any or all of these patterns which will be substituted with actual values:
Otherwise the command line will be called with the following default arguments:
Note: substitution patterns should not be quoted in the template strings. This is done automatically where required.
Additional information can be passed to event handlers via [cylc] →[[environment]].
[cylc] →[[events]] →EVENT handler A comma-separated list of one or more event handlers to call when one of the following EVENTs occurs:
Default values for these can be set at the site level via the site rc file (see B.14.4).
Item details:
[cylc] →[[events]] →handlers Specify the general event handlers as a list of command lines or command line templates.
[cylc] →[[events]] →handler events Specify the events for which the general event handlers should be invoked.
[cylc] →[[events]] →mail events Specify the suite events for which notification emails should be sent.
[cylc] →[[events]] →mail footer Specify a string or string template to insert into the footer of notification emails for both suite events and task events.
A template string may have any or all of these patterns which will be substituted with actual values:
[cylc] →[[events]] →mail from Specify an alternate from: email address for suite event notifications.
[cylc] →[[events]] →mail smtp Specify the SMTP server for sending suite event email notifications.
[cylc] →[[events]] →mail to A list of email addresses to send suite event notifications. The list can be anything accepted by the mail command.
[cylc] →[[events]] →timeout If a timeout is set and the timeout event is handled, the timeout event handler(s) will be called if the suite stays in a stalled state for some period of time. The timer is set initially at suite start up. It is possible to set a default for this at the site level (see B.14.4).
[cylc] →[[events]] →inactivity If inactivity is set and the inactivity event is handled, the inactivity event handler(s) will be called if there is no activity in the suite for some period of time. The timer is set initially at suite start up. It is possible to set a default for this at the site level (see B.14.4).
[cylc] →[[events]] →reset timer If True (the default) the suite timer will continually reset after any task changes state, so you can time out after some interval since the last activity occurred rather than on absolute suite execution time.
[cylc] →[[events]] →abort on stalled If this is set to True, the suite will abort with error status if it stalls. A suite is considered “stalled” if there are no active, queued or submitting tasks, nor tasks waiting for clock triggers to be met. It is possible to set a default for this at the site level (see B.14.4).
[cylc] →[[events]] →abort on timeout If a suite timer is set (above) this will cause the suite to abort with error status if the suite times out while still running. It is possible to set a default for this at the site level (see B.14.4).
[cylc] →[[events]] →abort on inactivity If a suite inactivity timer is set (above) this will cause the suite to abort with error status if the suite is inactive for some period while still running. It is possible to set a default for this at the site level (see B.14.4).
[cylc] →[[events]] →abort if EVENT handler fails Cylc does not normally care whether an event handler succeeds or fails, but if this is turned on the EVENT handler will be executed in the foreground (which will block the suite while it is running) and the suite will abort if the handler fails.
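Putting some of the above items together, a minimal sketch of a suite events section (the handler script name is hypothetical, and the %(event)s and %(suite)s substitution patterns are assumed from the standard set):

    [cylc]
        [[events]]
            handlers = notify-oncall.sh %(event)s %(suite)s
            handler events = timeout, stalled
            mail events = shutdown
            abort on stalled = True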
Environment variables defined in this section are passed to suite and task event handlers.
[cylc] →[[environment]] →__VARIABLE__ Replace __VARIABLE__ with any number of environment variable assignment expressions. Values may refer to other local environment variables (order of definition is preserved) and are not evaluated or manipulated by cylc, so any variable assignment expression that is legal in the shell in which cylc is running can be used (but see the warning above on variable expansions, which will not be evaluated). White space around the ‘=’ is allowed (as far as cylc’s file parser is concerned these are just suite configuration items).
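A minimal sketch (the variable names are hypothetical); note that order of definition is preserved, so later values can refer to earlier ones:

    [cylc]
        [[environment]]
            DATA_ROOT = $HOME/data
            OBS_DIR = $DATA_ROOT/obs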
Reference tests are finite-duration suite runs that abort with non-zero exit status if cylc fails, if any task fails, if the suite times out, or if a shutdown event handler that (by default) compares the test run with a reference run reports failure. See 12.26.
[cylc] →[[reference test]] →suite shutdown event handler A shutdown event handler that should compare the test run with the reference run, exiting with zero exit status only if the test run verifies.
As for any event handler, the full path can be omitted if the script is located somewhere in $PATH or in the suite bin directory.
[cylc] →[[reference test]] →required run mode If your reference test is only valid for a particular run mode, this setting will cause cylc to abort if a reference test is attempted in another run mode.
[cylc] →[[reference test]] →allow task failures A reference test run will abort immediately if any task fails, unless this item is set, or a list of expected task failures is provided (below).
[cylc] →[[reference test]] →expected task failures A reference test run will abort immediately if any task fails, unless allow task failures is set (above) or the failed task is found in a list of IDs of tasks that are expected to fail.
[cylc] →[[reference test]] →live mode suite timeout The timeout value, expressed as an ISO 8601 duration/interval, after which the test run should be aborted if it has not finished, in live mode. Test runs cannot be done in live mode unless you define a value for this item, because it is not possible to arrive at a sensible default for all suites.
[cylc] →[[reference test]] →simulation mode suite timeout The timeout value, expressed as an ISO 8601 duration/interval, after which the test run should be aborted if it has not finished, in simulation mode. Test runs cannot be done in simulation mode unless you define a value for this item, because it is not possible to arrive at a sensible default for all suites.
[cylc] →[[reference test]] →dummy mode suite timeout The timeout value, expressed as an ISO 8601 duration/interval, after which the test run should be aborted if it has not finished, in dummy mode. Test runs cannot be done in dummy mode unless you define a value for this item, because it is not possible to arrive at a sensible default for all suites.
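For illustration, a minimal reference test section (the expected-failure task ID is hypothetical):

    [cylc]
        [[reference test]]
            required run mode = live
            live mode suite timeout = PT10M
            expected task failures = bad_model.20100101T0000Z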
Authentication of client programs with suite server programs can be set in the global site/user config files and overridden here if necessary. See B.15 for more information.
[cylc] →[[authentication]] →public The client privilege level granted for public access - i.e. no suite passphrase required. See B.15 for legal values.
Suite-level configuration for the simulation and dummy run modes described in Section 12.25.
[cylc] →[[simulation]] →disable suite event handlers If this is set to True configured suite event handlers will not be called in simulation or dummy modes.
This section allows cylc to determine when tasks are ready to run.
Cylc runs using the proleptic Gregorian calendar by default. This item allows you to run the suite using the 360-day calendar (twelve months of 30 days in a year) or using integer cycling instead. The 365-day (never a leap year) and 366-day (always a leap year) calendars are also supported.
In a cold start each cycling task (unless specifically excluded under [special tasks]) will be loaded into the suite with this cycle point, or with the closest subsequent valid cycle point for the task. This item can be overridden on the command line or in the gcylc suite start panel.
In date-time cycling, if you do not provide time zone information for this, it will be assumed to be local time, or in UTC if A.3.2 is set, or in the time zone determined by A.3.5 if that is set.
The string “now” converts to the current date-time on the suite host (adjusted to UTC if the suite is in UTC mode but the host is not) to minute resolution. Minutes (or hours, etc.) may be ignored depending on your cycle point format (A.3.3).
Cycling tasks are held once they pass the final cycle point, if one is specified. Once all tasks have achieved this state the suite will shut down. If this item is provided you can override it on the command line or in the gcylc suite start panel.
In date-time cycling, if you do not provide time zone information for this, it will be assumed to be local time, or in UTC if A.3.2 is set, or in the time zone determined by A.3.5 if that is set.
In a cycling suite it is possible to restrict the initial cycle point by defining a list of truncated time points under the initial cycle point constraints.
In a cycling suite it is possible to restrict the final cycle point by defining a list of truncated time points under the final cycle point constraints.
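For example, to constrain a date-time cycling suite to start and end only at 00 or 12 hours (a minimal sketch):

    [scheduling]
        initial cycle point constraints = T00, T12
        final cycle point constraints = T00, T12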
Cycling tasks are held once they pass the hold after cycle point, if one is specified. Unlike with the final cycle point, the suite will not shut down once all tasks have passed this point. If this item is provided you can override it on the command line or in the gcylc suite start panel.
Runahead limiting prevents the fastest tasks in a suite from getting too far ahead of the slowest ones, as documented in 12.16.
This config item specifies a hard limit as a cycle interval between the slowest and fastest tasks. It is deprecated in favour of the newer default limiting by max active cycle points (A.4.8).
Runahead limiting prevents the fastest tasks in a suite from getting too far ahead of the slowest ones, as documented in 12.16.
This config item supersedes the deprecated hard runahead limit (A.4.7). It allows up to N (default 3) consecutive cycle points to be active at any time, adjusted up if necessary for any future triggering.
Allows tasks to spawn out to max active cycle points (A.4.8), removing the restriction that a task must have submitted before its successor can be spawned.
Important: this should be used with care, given the potential memory and CPU impact of the additional task proxies on the cylc daemon, as well as the overhead of rendering the extra tasks in gcylc. Use of this setting may also expose flaws in suite designs that rely on the default behaviour, in which downstream tasks wait on upstream ones submitting and the suite would otherwise have stalled. For example, a housekeeping task at a later cycle could delete an earlier cycle’s data before that cycle has had a chance to run, where previously the housekeeping task would not have been spawned until its predecessor had been submitted.
Configuration of internal queues, by which the number of simultaneously active tasks (submitted or running) can be limited, per queue. By default a single queue called default is defined, with all tasks assigned to it and no limit. To use a single queue for the whole suite just set the limit on the default queue as required. See also 12.17.
[scheduling] →[[queues]] →[[[__QUEUE__]]] Section heading for configuration of a single queue. Replace __QUEUE__ with a queue name, and repeat the section as required.
[scheduling] →[[queues]] →[[[__QUEUE__]]] →limit The maximum number of active tasks allowed at any one time, for this queue.
[scheduling] →[[queues]] →[[[__QUEUE__]]] →members A list of member tasks, or task family names, to assign to this queue (assigned tasks will automatically be removed from the default queue).
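A minimal sketch, with hypothetical task names, that limits the whole suite to ten active tasks and a pair of large jobs to two:

    [scheduling]
        [[queues]]
            [[[default]]]
                limit = 10
            [[[big_jobs]]]
                limit = 2
                members = model_a, model_b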
This section is used to identify tasks with special behaviour. Family names can be used in special task lists as shorthand for listing all member tasks.
[scheduling] →[[special tasks]] →clock-trigger Clock-trigger tasks (see 9.3.5.14) wait on a wall clock time specified as an offset from their own cycle point.
[scheduling] →[[special tasks]] →clock-expire Clock-expire tasks enter the expired state and skip job submission if too far behind the wall clock when they become ready to run. The expiry time is specified as an offset from wall-clock time; typically it should be negative - see 9.3.5.15.
[scheduling] →[[special tasks]] →external-trigger Externally triggered tasks (see 9.3.5.16) wait on external events reported via the cylc ext-trigger command. To constrain triggers to a specific cycle point, include $CYLC_TASK_CYCLE_POINT in the trigger message string and pass the cycle point to the cylc ext-trigger command.
[scheduling] →[[special tasks]] →sequential Sequential tasks automatically depend on their own previous-cycle instance. This declaration is deprecated in favour of explicit inter-cycle triggers - see 9.3.5.12.
[scheduling] →[[special tasks]] →exclude at start-up Any task listed here will be excluded from the initial task pool (this goes for suite restarts too). If an inclusion list is also specified, the initial pool will contain only included tasks that have not been excluded. Excluded tasks can still be inserted at run time. Other tasks may still depend on excluded tasks if they have not been removed from the suite dependency graph, in which case some manual triggering, or insertion of excluded tasks, may be required.
[scheduling] →[[special tasks]] →include at start-up If this list is not empty, any task not listed in it will be excluded from the initial task pool (this goes for suite restarts too). If an exclusion list is also specified, the initial pool will contain only included tasks that have not been excluded. Excluded tasks can still be inserted at run time. Other tasks may still depend on excluded tasks if they have not been removed from the suite dependency graph, in which case some manual triggering, or insertion of excluded tasks, may be required.
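For illustration, a minimal special tasks section (task names and offsets are hypothetical):

    [scheduling]
        [[special tasks]]
            clock-trigger = get_obs(PT1H)
            clock-expire = housekeep(-P1D)
            external-trigger = get_data("new data ready for $CYLC_TASK_CYCLE_POINT")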
The suite dependency graph is defined under this section. You can plot the dependency graph as you work on it, with cylc graph or by right clicking on the suite in the db viewer. See also 9.3.
[scheduling] →[[dependencies]] →graph The dependency graph for a completely non-cycling suite goes here. See also A.4.12.2.1 below and 9.3, for graph string syntax.
[scheduling] →[[dependencies]] →[[[__RECURRENCE__]]] __RECURRENCE__ section headings define the sequence of cycle points for which the subsequent graph section is valid. These should be specified in our ISO 8601 derived sequence syntax, or similar for integer cycling:
See 9.3.3 for more on recurrence expressions, and how multiple graph sections combine.
[scheduling] →[[dependencies]] →[[[__RECURRENCE__]]] →graph The dependency graph for a given recurrence section goes here. Syntax examples follow; see also 9.3 and 9.3.5.
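For example, a minimal date-time cycling sketch (task names are hypothetical), with a twice-daily recurrence and an inter-cycle trigger:

    [scheduling]
        [[dependencies]]
            [[[T00,T12]]]
                graph = """
                    foo => bar & baz
                    foo[-PT12H] => foo
                """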
This section is used to specify how, where, and what to execute when tasks are ready to run. Common configuration can be factored out in a multiple-inheritance hierarchy of runtime namespaces that culminates in the tasks of the suite. Order of precedence is determined by the C3 linearization algorithm as used to find the method resolution order in Python language class hierarchies. For details and examples see 9.4.
Replace __NAME__ with a namespace name, or a comma-separated list of names, and repeat as needed to define all tasks in the suite. Names may contain letters, digits, underscores, and hyphens. A namespace represents a group or family of tasks if other namespaces inherit from it, or a task if no others inherit from it.
If multiple names are listed the subsequent settings apply to each.
All namespaces inherit initially from root, which can be explicitly configured to provide or override default settings for all tasks in the suite.
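A minimal sketch of such a hierarchy (the names are hypothetical):

    [runtime]
        [[root]]
            [[[environment]]]
                REGION = global
        [[OBS]]
            script = process-obs.sh
        [[obs1, obs2]]
            inherit = OBS

Here obs1 and obs2 are tasks that share the OBS family configuration, and every namespace inherits the root environment.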
[runtime] →[[__NAME__]] →extra log files A list of user defined log files associated with a task. Files defined here will appear alongside the default log files in the cylc gui. Log files must reside in the job log directory and ideally should be named using the $CYLC_TASK_LOG_ROOT prefix (see 9.4.7.3).
[runtime] →[[__NAME__]] →inherit A list of the immediate parent(s) this namespace inherits from. If no parents are listed root is assumed.
[runtime] →[[__NAME__]] →init-script This is invoked by the task job script before the task execution environment is configured, so it does not have access to any suite or task environment variables. It can be a single command or multiple lines of scripting. The original intention was to allow remote tasks to source login scripts to configure their access to cylc, but this should no longer be necessary (see 12.3). See also env-script, err-script, pre-script, script, and post-script.
[runtime] →[[__NAME__]] →env-script This is invoked by the task job script between the cylc-defined environment (suite and task identity, etc.) and the user-defined task runtime environment - i.e. it has access to the cylc environment, and the task environment has access to variables defined by this scripting. It can be a single command or multiple lines of scripting. See also init-script, err-script, pre-script, script, and post-script.
[runtime] →[[__NAME__]] →err-script This is any custom script to be invoked at the end of the error trap, (if the error trap is triggered due to failure of a command in the task job). The output of this will always be sent to STDERR and $1 is set to the name of the signal caught by the error trap. The script should be fast and use very little system resource to ensure that the error trap can return quickly. It can be a single command or multiple lines of scripting. See also init-script, env-script, pre-script, script, and post-script.
[runtime] →[[__NAME__]] →pre-script This is invoked by the task job script immediately before the script item (just below). It can be a single command or multiple lines of scripting. See also init-script, env-script, err-script, script, and post-script.
[runtime] →[[__NAME__]] →script This is the main user-defined scripting to run when the task is ready. It can be a single command or multiple lines of scripting. See also init-script, env-script, err-script, pre-script, and post-script.
[runtime] →[[__NAME__]] →post-script This is invoked by the task job script immediately after the script item (just above). It can be a single command or multiple lines of scripting. See also init-script, env-script, err-script, pre-script, and script.
[runtime] →[[__NAME__]] →work sub-directory Task job scripts are executed from within work directories created automatically under the suite run directory. A task can get its own work directory from $CYLC_TASK_WORK_DIR (or simply $PWD if it does not cd elsewhere at runtime). The default directory path contains task name and cycle point, to provide a unique workspace for every instance of every task. If several tasks need to exchange files by simply reading and writing in their current working directory, this item can be used to override the default so that they all use the same workspace.
The top level share and work directory location can be changed (e.g. to a large data area) by a global config setting (see B.9.1.2).
Note that if you omit cycle point from the work sub-directory path successive instances of the task will share the same workspace. Consider the effect on cycle point offset housekeeping of work directories before doing this.
[runtime] →[[__NAME__]] →[[[meta]]] Section containing metadata items for this task or family namespace. Several items (title, description, URL) are pre-defined and are used by the GUI. Others can be user-defined and passed to task event handlers to be interpreted according to your needs. For example, the value of an “importance” item could determine how an event handler responds to task failure events.
Any suite meta item can now be passed to task event handlers by prefixing the string template item name with “suite_”, for example:
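For instance, the pre-defined title item would be available to a task event handler template as %(suite_title)s.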
[runtime] →[[__NAME__]] →[[[meta]]] →title A single line description of this namespace. It is displayed by the cylc list command and can be retrieved from running tasks with the cylc show command.
[runtime] →[[__NAME__]] →[[[meta]]] →description A multi-line description of this namespace, retrievable from running tasks with the cylc show command.
[runtime] →[[__NAME__]] →[[[meta]]] →URL A web URL to task documentation for this suite. If present it can be browsed with the cylc doc command, or by right-clicking on the task in gcylc. The string templates %(suite_name)s and %(task_name)s will be replaced with the actual suite and task names. See also suite URLs (A.2.3).
(Note that URLs containing the comment delimiter # must be protected by quotes).
[runtime] →[[__NAME__]] →[[[meta]]] →__MANY__ Replace __MANY__ with any user-defined metadata item. These, like title, URL, etc., can be passed to task event handlers to be interpreted according to your needs. For example, the value of an “importance” item could determine how an event handler responds to task failure events.
[runtime] →[[__NAME__]] →[[[job]]] This section configures the means by which cylc submits task job scripts to run.
[runtime] →[[__NAME__]] →[[[job]]] →batch system See 11 for how job submission works, and how to define new handlers for different batch systems. Cylc has a number of built in batch system handlers:
[runtime] →[[__NAME__]] →[[[job]]] →execution time limit Specify the execution wall clock limit for a job of the task. For background and at, the job script will be invoked using the timeout command. For other batch systems, the specified time will be automatically translated into the equivalent directive for wall clock limit.
Tasks are polled multiple times, where necessary, when they exceed their execution time limits. (See B.9.1.18.6 for how to configure the polling intervals).
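A minimal sketch (the batch system choice is illustrative):

    [runtime]
        [[model]]
            [[[job]]]
                batch system = pbs
                execution time limit = PT1H30M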
[runtime] →[[__NAME__]] →[[[job]]] →batch submit command template This allows you to override the actual command used by the chosen batch system. The template’s %(job)s will be substituted by the job file path.
[runtime] →[[__NAME__]] →[[[job]]] →shell Location of the command used to interpret the job script submitted by the suite server program when a task is ready to run. This can be set to the location of bash on the job host if the shell is not installed in the standard location. Note: it has no bearing on any sub-shells that may be called by the job script.
Setting this to the path of a ksh93 interpreter is deprecated; support for ksh93 will be withdrawn in a future cylc release. Setting this to any other shell is not supported.
[runtime] →[[__NAME__]] →[[[job]]] →submission retry delays A list of durations (in ISO 8601 syntax) after which to resubmit if job submission fails.
[runtime] →[[__NAME__]] →[[[job]]] →execution retry delays See also 12.18.
A list of ISO 8601 time duration/intervals after which to resubmit the task if it fails. The variable $CYLC_TASK_TRY_NUMBER in the task execution environment is incremented each time, starting from 1 for the first try - this can be used to vary task behaviour by try number.
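For example, to retry after 30 seconds, then after 10 minutes, then three more times at hourly intervals, using the optional multiplier notation (the task name is hypothetical):

    [runtime]
        [[flaky_model]]
            [[[job]]]
                execution retry delays = PT30S, PT10M, 3*PT1H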
[runtime] →[[__NAME__]] →[[[job]]] →submission polling intervals A list of intervals, expressed as ISO 8601 duration/intervals, with optional multipliers, after which cylc will poll for status while the task is in the submitted state.
For the polling task communication method this overrides the default submission polling interval in the site/user config files (6). For default and ssh task communications, polling is not done by default but it can still be configured here as a regular check on the health of submitted tasks.
Each list value is used in turn until the last, which is used repeatedly until finished.
A single interval value is probably appropriate for submission polling.
[runtime] →[[__NAME__]] →[[[job]]] →execution polling intervals A list of intervals, expressed as ISO 8601 duration/intervals, with optional multipliers, after which cylc will poll for status while the task is in the running state.
For the polling task communication method this overrides the default execution polling interval in the site/user config files (6). For default and ssh task communications, polling is not done by default but it can still be configured here as a regular check on the health of submitted tasks.
Each list value is used in turn until the last, which is used repeatedly until finished.
[runtime] →[[__NAME__]] →[[[remote]]] Configure host and username, for tasks that do not run on the suite host account. Non-interactive ssh is used to submit the task via the configured batch system, so you must distribute your ssh key to allow this. Cylc must be installed on task remote accounts, but no external software dependencies are required there.
[runtime] →[[__NAME__]] →[[[remote]]] →host The remote host for this namespace. This can be a static hostname, an environment variable that holds a hostname, or a command that prints a hostname to stdout. Host selection commands are executed just prior to job submission. The host (static or dynamic) may have an entry in the cylc site or user config file to specify parameters such as the location of cylc on the remote machine; if not, the corresponding local settings (on the suite host) will be assumed to apply on the remote host.
[runtime] →[[__NAME__]] →[[[remote]]] →owner The username of the task host account. This is (only) used in the non-interactive ssh command invoked by the suite server program to submit the remote task; consequently it may be defined using local environment variables (i.e. those of the shell in which cylc runs, or [cylc] →[[environment]]).
If you use dynamic host selection and have different usernames on the different selectable hosts, you can configure your $HOME/.ssh/config to handle username translation.
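A minimal sketch (the host and username are hypothetical):

    [runtime]
        [[model]]
            [[[remote]]]
                host = hpc-login.example.com
                owner = fcst_ops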
[runtime] →[[__NAME__]] →[[[remote]]] →retrieve job logs Remote task job logs are saved to the suite run directory on the task host, not on the suite host. If you want the job logs pulled back to the suite host automatically, you can set this item to True. The suite will then attempt to rsync the job logs once from the remote host each time a task job completes. E.g. if the job file is ~/cylc-run/tut.oneoff.remote/log/job/1/hello/01/job, anything under ~/cylc-run/tut.oneoff.remote/log/job/1/hello/01/ will be retrieved.
[runtime] →[[__NAME__]] →[[[remote]]] →retrieve job logs max size If the disk space of the suite host is limited, you may want to set the maximum sizes of the job log files to retrieve. The value can be anything that is accepted by the --max-size=SIZE option of the rsync command.
[runtime] →[[__NAME__]] →[[[remote]]] →retrieve job logs retry delays Some batch systems have considerable delays between the time when the job completes and when it writes the job logs in its normal location. If this is the case, you can configure an initial delay and some retry delays between subsequent attempts. The default behaviour is to attempt once without any delay.
[runtime] →[[__NAME__]] →[[[remote]]] →suite definition directory The path to the suite definition directory on the remote account, needed if remote tasks require access to files stored there (via $CYLC_SUITE_DEF_PATH) or in the suite bin directory (via $PATH). If this item is not defined, the local suite definition directory path will be assumed, with the suite owner’s home directory, if present, replaced by '$HOME' for interpretation on the remote account.
[runtime] →[[__NAME__]] →[[[events]]] Cylc can call nominated event handlers when certain task events occur. This section configures specific task event handlers; see A.3.13 for suite events.
Event handlers can be located in the suite bin/ directory, otherwise it is up to you to ensure their location is in $PATH (in the shell in which the suite server program runs). They should require little resource to run and return quickly.
Each task event handler can be specified as a list of command lines or command line templates. They can contain any or all of the following patterns, which will be substituted with actual values:
Otherwise, the command line will be called with the following default arguments:
Note: substitution patterns should not be quoted in the template strings. This is done automatically where required.
For an explanation of the substitution syntax, see String Formatting Operations in the Python documentation: https://docs.python.org/2/library/stdtypes.html#string-formatting.
Additional information can be passed to event handlers via the [cylc] →[[environment]] (but not via task runtime environments - event handlers are not called by tasks).
[runtime] →[[__NAME__]] →[[[events]]] →EVENT handler A list of one or more event handlers to call when one of the following EVENTs occurs:
Item details:
[runtime] →[[__NAME__]] →[[[events]]] →submission timeout If a task has not started after the specified ISO 8601 duration/interval, the submission timeout event handler(s) will be called.
[runtime] →[[__NAME__]] →[[[events]]] →execution timeout If a task has not finished after the specified ISO 8601 duration/interval, the execution timeout event handler(s) will be called.
[runtime] →[[__NAME__]] →[[[events]]] →reset timer If you set an execution timeout, the timer can be reset to zero every time a message is received from the running task (which indicates the task is still alive). Otherwise, the task will time out if it does not finish in the allotted time, regardless of incoming messages.
[runtime] →[[__NAME__]] →[[[events]]] →handlers Specify a list of command lines or command line templates as task event handlers.
[runtime] →[[__NAME__]] →[[[events]]] →handler events Specify the events for which the general task event handlers should be invoked.
[runtime] →[[__NAME__]] →[[[events]]] →handler retry delays Specify an initial delay before running an event handler command and any retry delays in case the command returns a non-zero code. The default behaviour is to run an event handler command once without any delay.
[runtime] →[[__NAME__]] →[[[events]]] →mail events Specify the events for which notification emails should be sent.
[runtime] →[[__NAME__]] →[[[events]]] →mail from Specify an alternate from: email address for event notifications.
[runtime] →[[__NAME__]] →[[[events]]] →mail retry delays Specify an initial delay before running the mail notification command and any retry delays in case the command returns a non-zero code. The default behaviour is to run the mail notification command once without any delay.
[runtime] →[[__NAME__]] →[[[events]]] →mail smtp Specify the SMTP server for sending email notifications.
[runtime] →[[__NAME__]] →[[[events]]] →mail to A list of email addresses to send task event notifications. The list can be anything accepted by the mail command.
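Putting some of these items together, a minimal sketch (the handler script name is hypothetical, and the %(event)s and %(id)s substitution patterns are assumed from the standard set):

    [runtime]
        [[model]]
            [[[events]]]
                handlers = report-failure.sh %(event)s %(id)s
                handler events = failed, submission failed
                mail events = failed
                execution timeout = PT3H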
[runtime] →[[__NAME__]] →[[[environment]]] The user defined task execution environment. Variables defined here can refer to cylc suite and task identity variables, which are exported earlier in the task job script, and variable assignment expressions can use cylc utility commands because access to cylc is also configured earlier in the script. See also 9.4.7.
[runtime] →[[__NAME__]] →[[[environment]]] →__VARIABLE__ Replace __VARIABLE__ with any number of environment variable assignment expressions. Order of definition is preserved so values can refer to previously defined variables. Values are passed through to the task job script without evaluation or manipulation by cylc, so any variable assignment expression that is legal in the job submission shell can be used. White space around the ‘=’ is allowed (as far as cylc’s suite.rc parser is concerned these are just normal configuration items).
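A minimal sketch (variable names and paths are hypothetical); later definitions can refer to earlier ones and to cylc-defined variables:

    [runtime]
        [[model]]
            [[[environment]]]
                DATA_DIR = $CYLC_SUITE_SHARE_DIR/$CYLC_TASK_CYCLE_POINT
                RESTART_FILE = $DATA_DIR/restart.nc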
[runtime] →[[__NAME__]] →[[[environment filter]]] This section contains environment variable inclusion and exclusion lists that can be used to filter the inherited environment. This is not intended as an alternative to a well-designed inheritance hierarchy that provides each task with just the variables it needs. Filters can, however, improve suites with tasks that inherit a lot of environment they don’t need, by making it clear which tasks use which variables. They can optionally be used routinely as explicit “task environment interfaces” too, at some cost to brevity, because they guarantee that variables filtered out of the inherited task environment are not used.
Note that environment filtering is done after inheritance is completely worked out, not at each level on the way, so filter lists in higher-level namespaces only have an effect if they are not overridden by descendants.
[runtime] →[[__NAME__]] →[[[environment filter]]] →include If given, only variables named in this list will be included from the inherited environment, others will be filtered out. Variables may also be explicitly excluded by an exclude list.
[runtime] →[[__NAME__]] →[[[environment filter]]] →exclude Variables named in this list will be filtered out of the inherited environment. Variables may also be implicitly excluded by omission from an include list.
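For illustration, a sketch in which a task keeps only two of the variables it would otherwise inherit (the names are hypothetical):

    [runtime]
        [[FAM]]
            [[[environment]]]
                A = apple
                B = banana
                C = cherry
        [[task1]]
            inherit = FAM
            [[[environment filter]]]
                include = A, B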
[runtime] →[[__NAME__]] →[[[parameter environment templates]]] The user defined task execution parameter environment templates. This is only relevant for parameterized tasks - see Section 9.6.
[runtime] →[[__NAME__]] →[[[parameter environment templates]]] →__VARIABLE__ Replace __VARIABLE__ with pairs of environment variable name and Python string template for parameter substitution. This is only relevant for parameterized tasks - see Section 9.6.
If specified, in addition to the standard CYLC_TASK_PARAM_<key> variables, the job script will also export the named variables specified here, with the template strings substituted with the parameter values.
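A minimal sketch (the parameter, variable name, and path are hypothetical):

    [runtime]
        [[model<run>]]
            [[[parameter environment templates]]]
                RUN_DIR = $HOME/archive/run-%(run)s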
[runtime] →[[__NAME__]] →[[[directives]]] Batch queue scheduler directives. Whether or not these are used depends on the batch system. For the built-in methods that support directives (loadleveler, lsf, pbs, sge, slurm, moab), directives are written to the top of the task job script in the correct format for the method. Specifying directives individually like this allows use of default directives that can be individually overridden at lower levels of the runtime namespace hierarchy.
[runtime] →[[__NAME__]] →[[[directives]]] →__DIRECTIVE__ Replace __DIRECTIVE__ with each directive assignment, e.g. class = parallel
Example directives for the built-in batch system handlers are shown in 11.1.
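For instance, a PBS sketch (the queue name and resource request are hypothetical):

    [runtime]
        [[model]]
            [[[job]]]
                batch system = pbs
            [[[directives]]]
                -q = forecast
                -l walltime = 00:30:00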
[runtime] →[[__NAME__]] →[[[outputs]]] Register custom task outputs, for use in message triggering, in this section (9.3.5.5).
[runtime] →[[__NAME__]] →[[[outputs]]] →__OUTPUT__ Replace __OUTPUT__ with one or more custom task output messages (9.3.5.5). The item name is used to select the custom output message in graph trigger notation.
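A minimal sketch of a custom output and its use as a graph trigger (the names and message are hypothetical); the running task would report the message with cylc message to satisfy the trigger:

    [scheduling]
        [[dependencies]]
            graph = model:files_ready => postproc
    [runtime]
        [[model]]
            [[[outputs]]]
                files_ready = "forecast output files written"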
[runtime] →[[__NAME__]] →[[[suite state polling]]] Configure automatic suite polling tasks as described in 12.27. The items in this section reflect the options and defaults of the cylc suite-state command, except that the target suite name and the --task, --cycle, and --status options are taken from the graph notation.
[runtime] →[[__NAME__]] →[[[suite state polling]]] →run-dir For your own suites the run database location is determined by your site/user config. For other suites, e.g. those owned by others, or mirrored suite databases, use this item to specify the location of the top level cylc run directory (the database should be a suite-name sub-directory of this location).
[runtime] →[[__NAME__]] →[[[suite state polling]]] →template Cycle point template of the target suite, if different from that of the polling suite.
[runtime] →[[__NAME__]] →[[[suite state polling]]] →interval Polling interval expressed as an ISO 8601 duration/interval.
[runtime] →[[__NAME__]] →[[[suite state polling]]] →max-polls The maximum number of polls before timing out and entering the ‘failed’ state.
[runtime] →[[__NAME__]] →[[[suite state polling]]] →user Username of an account on the suite host to which you have access. The polling cylc suite-state command will be invoked on the remote account.
[runtime] →[[__NAME__]] →[[[suite state polling]]] →host The hostname of the target suite. The polling cylc suite-state command will be invoked on the remote account.
[runtime] →[[__NAME__]] →[[[suite state polling]]] →verbose Run the polling cylc suite-state command in verbose output mode.
[runtime] →[[__NAME__]] →[[[simulation]]] Task configuration for the suite simulation and dummy run modes described in Section 12.25.
[runtime] →[[__NAME__]] →[[[simulation]]] →default run length The default simulated job run length, if [job]execution time limit and [simulation]speedup factor are not set.
[runtime] →[[__NAME__]] →[[[simulation]]] →speedup factor If [job]execution time limit is set, the task simulated run length is computed by dividing it by this factor.
[runtime] →[[__NAME__]] →[[[simulation]]] →time limit buffer For dummy jobs, a new [job]execution time limit is set to the simulated task run length plus this buffer interval, to avoid job kill due to exceeding the time limit.
[runtime] →[[__NAME__]] →[[[simulation]]] →fail cycle points Configure simulated or dummy jobs to fail at certain cycle points.
[runtime] →[[__NAME__]] →[[[simulation]]] →fail try 1 only If this is set to True only the first run of the task instance will fail, otherwise retries will fail too.
[runtime] →[[__NAME__]] →[[[simulation]]] →disable task event handlers If this is set to True configured task event handlers will not be called in simulation or dummy modes.
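For example, a sketch that makes a simulated task fail only on its first try at one cycle point (the values are hypothetical):

    [runtime]
        [[model]]
            [[[simulation]]]
                default run length = PT20S
                fail cycle points = 20100101T0000Z
                fail try 1 only = True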
Configuration of suite graphing for the cylc graph command (graph extent, styling, and initial family-collapsed state) and the gcylc graph view (initial family-collapsed state). Graphviz documentation of node shapes and so on can be found at http://www.graphviz.org/documentation/.
The initial cycle point for graph plotting.
The visualization initial cycle point gets adjusted up if necessary to the suite initial cycle point.
An explicit final cycle point for graph plotting. If used, this overrides the preferred number of cycle points (below).
The visualization final cycle point gets adjusted down if necessary to the suite final cycle point.
The number of cycle points to graph starting from the visualization initial cycle point. This is the preferred way of defining the graph end point, but it can be overridden by an explicit final cycle point (above).
A list of family (namespace) names to be shown in the collapsed state (i.e. the family members will be replaced by a single family node) when the suite is first plotted in the graph viewer or the gcylc graph view. If this item is not set, the default is to collapse all families at first. Interactive GUI controls can then be used to group and ungroup family nodes at will.
Plot graph edges (dependency arrows) with the same color as the upstream node, otherwise default to black.
Plot graph edges (i.e. dependency arrows) with the same fillcolor as the upstream node, if it is filled, otherwise default to black.
Line width of node shape borders.
Line width of graph edges (dependency arrows).
Graph node labels can be printed in the same color as the node outline.
Set the default attributes (color and style etc.) of graph nodes (tasks and families). Attribute pairs must be quoted to hide the internal = character.
Set the default attributes (color and style etc.) of graph edges (dependency arrows). Attribute pairs must be quoted to hide the internal = character.
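A minimal sketch; note that attribute pairs are quoted to hide the internal = character:

    [visualization]
        default node attributes = "style=filled", "fillcolor=grey"
        default edge attributes = "color=steelblue"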
Define named groups of graph nodes (tasks and families) which can be styled en masse, by name, in [visualization] →[[node attributes]]. Node groups are automatically defined for all task families, including root, so you can style family and member nodes at once by family name.
[visualization] →[[node groups]] →__GROUP__ Replace __GROUP__ with each named group of tasks or families.
Here you can assign graph node attributes to specific nodes, or to all members of named groups defined in [visualization] →[[node groups]]. Task families are automatically node groups. Styling of a family node applies to all member nodes (tasks and sub-families), but precedence is determined by ordering in the suite definition. For example, if you style a family red and then one of its members green, cylc will plot a red family with one green member; but if you style one member green and then the family red, the red family styling will override the earlier green styling of the member.
[visualization] →[[node attributes]] →__NAME__ Replace __NAME__ with each node or node group for style attribute assignment.
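For illustration, a sketch that styles a hypothetical group of tasks:

    [visualization]
        [[node groups]]
            cleanup = housekeep, archive
        [[node attributes]]
            cleanup = "style=filled", "fillcolor=orange"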
This section defines all legal items and values for cylc site and user config files. See Site And User Config Files (Section 6) for file locations, intended usage, and how to generate the files using the cylc get-site-config command.
As for suite definitions, Jinja2 expressions can be embedded in site and user config files to generate the final result parsed by cylc. Use of Jinja2 in suite definitions is documented in Section 9.7.
A temporary directory is needed by a few cylc commands, and is cleaned automatically on exit. Leave unset for the default (usually $TMPDIR).
Maximum number of concurrent processes used to execute external job submission, event handlers, and job poll and kill commands - see 12.20.
Interval after which long-running commands in the process pool will be killed - see 12.20.
Commands that intervene in running suites can be made to ask for confirmation before acting. Some find this annoying and ineffective as a safety measure, however, so command prompts are disabled by default.
The suite run directory tree is created anew with every suite start (not restart) but output from the most recent previous runs can be retained in a rolling archive. Set length to 0 to keep no backups. This is incompatible with current Rose suite housekeeping (see Section 13 for more on Rose) so it is disabled by default, in which case new suite run files will overwrite existing ones in the same run directory tree. Rarely, this can result in incorrect polling results due to the presence of old task status files.
The number of old run directory trees to retain if run directory housekeeping is enabled.
When a task host in a suite is a shell command string, cylc calls the shell to determine the task host. This call is invoked by the main process, and may cause the suite to hang while waiting for the command to finish. This setting sets a timeout for such a command to ensure that the suite can continue.
This section contains configuration items that affect task-to-suite communications.
If a send fails, the messaging code will retry after a configured delay interval.
If successive sends fail, the messaging code will give up after a configured number of tries.
This is the same as the --comms-timeout option in cylc commands. Without a timeout remote connections to unresponsive suites can hang indefinitely (suites suspended with Ctrl-Z for instance).
The suite event log, held under the suite run directory, is maintained as a rolling archive. Logs are rolled over (backed up and started anew) when they reach a configurable limit size.
If True, a new suite log will be started for a new suite run.
How many rolled logs to retain in the archive.
Suite event logs are rolled over when they reach this file size.
Documentation locations for the cylc doc command and gcylc Help menus.
File locations of documentation held locally on the cylc host server.
[documentation] →[[files]] →html index File location of the main cylc documentation index.
[documentation] →[[files]] →pdf user guide File location of the cylc User Guide, PDF version.
[documentation] →[[files]] →multi-page html user guide File location of the cylc User Guide, multi-page HTML version.
[documentation] →[[files]] →single-page html user guide File location of the cylc User Guide, single-page HTML version.
Online documentation URLs.
[documentation] →[[urls]] →internet homepage URL of the cylc internet homepage, with links to documentation for the latest official release.
[documentation] →[[urls]] →local index Local intranet URL of the main cylc documentation index.
PDF and HTML viewers can be launched by cylc to view the documentation.
Your preferred PDF viewer program.
Your preferred web browser.
Choose your favourite text editor for editing suite definitions.
The editor to be invoked by the cylc command line interface.
The editor to be invoked by the cylc GUI.
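A minimal sketch (the editor choices are illustrative):

    [editors]
        terminal = vim
        gui = gvim -f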
This section covers options for network communication between cylc clients (suite-connecting commands and GUIs) and servers (running suites). Each suite listens on a dedicated network port, binding on the first available one starting at the configured base port.
By default, the communication method is HTTPS secured with HTTP Digest Authentication. If the system does not support SSL, you should configure this section to use HTTP. Cylc will not automatically fall back to HTTP if HTTPS is not available.
The choice of client-server communication method - currently only HTTPS and HTTP are supported, although others could be developed and plugged in. Cylc defaults to HTTPS if this setting is not explicitly configured.
The first port that cylc is allowed to use.
This determines the maximum number of suites that can run at once on the suite host.
Enable or disable proxy servers for HTTPS - disabled by default.
Option flags for the communication method. Currently only ‘SHA1’ is supported for HTTPS, which alters HTTP Digest Auth to use the SHA1 hash algorithm rather than the standard MD5. This is more secure but also less well supported by third-party web clients, including web browsers. You may need to add the ‘SHA1’ option if you are running on platforms where MD5 is discouraged (e.g. under FIPS).
Configurable settings for the command line cylc monitor tool.
The sort order for tasks in the monitor view.
The [hosts] section configures some important host-specific settings for the suite host (‘localhost’) and remote task hosts. Note that remote task behaviour is determined by the site/user config on the suite host, not on the task host. Suites can specify task hosts that are not listed here, in which case local settings will be assumed, with the local home directory path, if present, replaced by $HOME in items that configure directory locations.
The default task host is the suite host, localhost, with default values as listed below. Use an explicit [hosts][[localhost]] section if you need to override the defaults. Localhost settings are then also used as defaults for other hosts, with the local home directory path replaced as described above. This applies to items omitted from an explicit host section, and to hosts that are not listed at all in the site and user config files. Explicit host sections are only needed if the automatically modified local defaults are not sufficient.
Host section headings can also be regular expressions to match multiple hostnames. Note that the general regular expression wildcard is ‘.*’ (zero or more of any character), not ‘*’. Hostname matching regular expressions are used as-is in the Python re.match() function. As such they match from the beginning of the hostname string (as specified in the suite definition) and they do not have to match through to the end of the string (use the string-end matching character ‘$’ in the expression to force this).
A hierarchy of host match expressions from specific to general can be used, because config items are processed in the order specified in the file.
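For illustration, a sketch using a hostname-matching pattern (the host name and path are hypothetical):

    [hosts]
        [[hpc-login.*]]
            task communication method = poll
            use login shell = False
            cylc executable = /opt/cylc/bin/cylc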
[hosts] →[[HOST]] →run directory The top level of the directory tree that holds suite-specific output logs, run database, etc.
[hosts] →[[HOST]] →work directory The top level for suite work and share directories.
[hosts] →[[HOST]] →task communication method The means by which task progress messages are reported back to the running suite. See below for default polling intervals for the poll method.
[hosts] →[[HOST]] →execution polling intervals Cylc can poll running jobs to catch problems that prevent task messages from being sent back to the suite, such as hard job kills, network outages, or unplanned task host shutdown. Routine polling is done only for the polling task communication method (above) unless suite-specific polling is configured in the suite definition. A list of interval values can be specified, with the last value used repeatedly until the task is finished - this allows more frequent polling near the beginning and end of the anticipated task run time. Multipliers can be used as shorthand as in the example below.
[hosts] →[[HOST]] →submission polling intervals Cylc can also poll submitted jobs to catch problems that prevent the submitted job from executing at all, such as deletion from an external batch scheduler queue. Routine polling is done only for the polling task communication method (above) unless suite-specific polling is configured in the suite definition. A list of interval values can be specified as for execution polling (above) but a single value is probably sufficient for job submission polling.
[hosts] →[[HOST]] →scp command A string for the command used to copy files to a remote host. This is not used on the suite host unless you run local tasks under another user account. The value is assumed to be scp with some initial options or a command that implements a similar interface to scp.
[hosts] →[[HOST]] →ssh command A string for the command used to invoke commands on this host. This is not used on the suite host unless you run local tasks under another user account. The value is assumed to be ssh with some initial options or a command that implements a similar interface to ssh.
[hosts] →[[HOST]] →use login shell Whether to use a login shell or not for remote command invocation. By default cylc runs remote ssh commands using a login shell, which will source /etc/profile and ~/.profile to set up the user environment. However, for security reasons some institutions do not allow unattended commands to start login shells, so you can turn off this behaviour; the default shell on the remote machine is then used instead, sourcing ~/.bashrc (or ~/.cshrc) to set up the environment.
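Schematically, assuming the default ssh invocation, the two forms look something like:

    ssh user@host 'bash --login cylc ...'    (use login shell = True, the default)
    ssh user@host 'cylc ...'                 (use login shell = False)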
[hosts] →[[HOST]] →cylc executable The cylc executable on a remote host. Note this should normally point to the cylc multi-version wrapper (see 7.2) on the host, not bin/cylc for a specific installed version. Specify a full path if cylc is not in $PATH when it is invoked via ssh on this host.
[hosts] →[[HOST]] →global init-script If specified, the value of this setting will be inserted just before the init-script section of all job scripts that are to be submitted to the specified remote host.
[hosts] →[[HOST]] →copyable environment variables A list containing the names of the environment variables that can and/or need to be copied from the suite server program to a job.
[hosts] →[[HOST]] →retrieve job logs Global default for the A.5.1.12.3 setting for the specified host.
[hosts] →[[HOST]] →retrieve job logs command If rsync -a is unavailable or insufficient to retrieve job logs from a remote host, you can use this setting to specify a suitable command.
[hosts] →[[HOST]] →retrieve job logs max size Global default for the A.5.1.12.4 setting for the specified host.
[hosts] →[[HOST]] →retrieve job logs retry delays Global default for the A.5.1.12.5 setting for the specified host.
[hosts] →[[HOST]] →task event handler retry delays Host specific default for the A.5.1.13.7 setting.
[hosts] →[[HOST]] →tail command template A command template (with %(filename)s substitution) to tail-follow job logs on HOST, by the GUI log viewer and cylc cat-log. You are unlikely to need to override this.
[hosts] →[[HOST]] →[[[batch systems]]] Settings for particular batch systems on HOST. In the subsections below, SYSTEM should be replaced with the cylc batch system handler name that represents the batch system (see A.5.1.11.1).
[hosts] →[[HOST]] →[[[batch systems]]] →[[[[SYSTEM]]]] →err tailer A command template (with %(job_id)s substitution) that can be used to tail-follow the stderr stream of a running job if SYSTEM does not use the normal log file location while the job is running. This setting overrides B.9.1.17 above.
[hosts] →[[HOST]] →[[[batch systems]]] →[[[[SYSTEM]]]] →out tailer A command template (with %(job_id)s substitution) that can be used to tail-follow the stdout stream of a running job if SYSTEM does not use the normal log file location while the job is running. This setting overrides B.9.1.17 above.
[hosts] →[[HOST]] →[[[batch systems]]] →[[[[SYSTEM]]]] →err viewer A command template (with %(job_id)s substitution) that can be used to view the stderr stream of a running job if SYSTEM does not use the normal log file location while the job is running.
[hosts] →[[HOST]] →[[[batch systems]]] →[[[[SYSTEM]]]] →out viewer A command template (with %(job_id)s substitution) that can be used to view the stdout stream of a running job if SYSTEM does not use the normal log file location while the job is running.
[hosts] →[[HOST]] →[[[batch systems]]] →[[[[SYSTEM]]]] →job name length maximum The maximum job name length accepted by the batch system on a given host. Currently, this setting is only meaningful for PBS jobs. For example, PBS 12 or older will fail a job submission if the job name has more than 15 characters, so 15 is the default setting. If you have PBS 13 or above, you may want to modify this setting to a larger value.
[hosts] →[[HOST]] →[[[batch systems]]] →[[[[SYSTEM]]]] →execution time limit polling intervals The intervals between polling after a task job (submitted to the relevant batch system on the relevant host) exceeds its execution time limit. The default setting is PT1M, PT2M, PT7M. The accumulated times (in minutes) for these intervals will be roughly 1, 1 + 2 = 3 and 1 + 2 + 7 = 10 after a task job exceeds its execution time limit.
The suite host’s identity must be determined locally by cylc and passed to running tasks (via $CYLC_SUITE_HOST) so that task messages can target the right suite on the right host.
This item determines how cylc finds the identity of the suite host. For the default name method cylc asks the suite host for its host name. This should resolve on remote task hosts to the IP address of the suite host; if it doesn’t, adjust network settings or use one of the other methods. For the address method, cylc attempts to use a special external “target address” to determine the IP address of the suite host as seen by remote task hosts (in-source documentation in <cylc-dir>/lib/cylc/hostuserutil.py explains how this works). And finally, as a last resort, you can choose the hardwired method and manually specify the host name or IP address of the suite host.
This item is required for the address self-identification method. If your suite host sees the internet, a common address such as google.com will do; otherwise choose a host visible on your intranet.
Use this item to explicitly set the name or IP address of the suite host if you have to use the hardwired self-identification method.
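For example, to hardwire the suite host identity (the host name is hypothetical):

    [suite host self-identification]
        method = hardwired
        host = suite-host.example.com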
Utilities such as cylc gscan need to scan hosts for running suites.
A list of hosts to scan for running suites.
Global site/user defaults for A.5.1.13.
Settings for the automated development tests. Note the test battery reads <cylc-dir>/etc/global-tests.rc instead of the normal site/user global config files.
The name of a remote host that sees the same HOME file system as the host running the test battery.
Host name of a remote account that does not see the same home directory as the account running the test battery - see also “remote owner” below.
User name of a remote account that does not see the same home directory as the account running the test battery - see also “remote host” above.
Settings for testing supported batch systems (job submission methods). The tests for a batch system are only performed if the batch system is available on the test host or a remote host accessible via SSH from the test host.
[test battery] →[[batch systems]] →[[[SYSTEM]]] SYSTEM is the name of a supported batch system with automated tests. This can currently be “loadleveler”, “lsf”, “pbs”, “sge” and/or “slurm”.
[test battery] →[[batch systems]] →[[[SYSTEM]]] →host The name of a host where commands for this batch system are available. Use “localhost” if the batch system is available on the host running the test battery. Any specified remote host should be accessible via SSH from the host running the test battery.
[test battery] →[[batch systems]] →[[[SYSTEM]]] →err viewer The command template (with %(job_id)s substitution) for testing the run time stderr viewer functionality for this batch system.
[test battery] →[[batch systems]] →[[[SYSTEM]]] →out viewer The command template (with %(job_id)s substitution) for testing the run time stdout viewer functionality for this batch system.
[test battery] →[[batch systems]] →[[[SYSTEM]]] →[[[[directives]]]] The minimum set of directives that must be supplied to the batch system on the site to initiate jobs for the tests.
Default values for entries in the suite.rc [cylc] section.
Allows you to set a default value for UTC mode in a suite at the site level. See A.3.2 for details.
Site default suite health check interval. See A.3.7 for details.
Site default task event mail interval. See A.3.8 for details.
You can define site defaults for each of the following options, details of which can be found under A.3.13:
[cylc] →[[events]] →handler events
[cylc] →[[events]] →startup handler
[cylc] →[[events]] →shutdown handler
[cylc] →[[events]] →mail events
[cylc] →[[events]] →mail footer
[cylc] →[[events]] →timeout handler
[cylc] →[[events]] →abort on timeout
[cylc] →[[events]] →stalled handler
[cylc] →[[events]] →abort on stalled
[cylc] →[[events]] →inactivity handler
[cylc] →[[events]] →inactivity
[cylc] →[[events]] →abort on inactivity
Authentication of client programs with suite server programs can be configured here, and overridden in suites if necessary (see A.3.16).
The suite-specific passphrase must be installed on a user’s account to authorize full control privileges (see 7.5 and 12.9). In the future we plan to move to a more traditional user account model so that each authorized user can have their own password.
This sets the client privilege level for public access - i.e. no suite passphrase required.
This section defines all legal items and values for the gcylc user config file, which should be located in $HOME/.cylc/gcylc.rc. Current settings can be printed with the cylc get-gui-config command.
Set the size of the task state dot icons displayed in the text and dot views.
Set the suite view panels initial orientation when the GUI starts. This can be changed later using the “View” menu “Toggle views side-by-side” option.
Set the suite view panel(s) displayed initially, when the GUI starts. This can be changed later using the tool bar.
Set the maximum (longest) time interval between calls to the suite for data update.
The update frequency of the GUI is variable. It is determined by considering the time of last update and the mean duration of the last 10 main loops of the suite.
In general, the GUI will use an update frequency that matches the mean duration of the suite’s main loop. In quiet time (or if the suite is not contactable), it will gradually increase the update interval (i.e. reduce the update frequency) to a maximum determined by this setting.
Increasing this setting will reduce the network traffic and hits on the suite process. However, if a quiet suite starts to pick up activity, the GUI may initially appear out of sync with what is happening in the suite for the duration of this interval.
If this is not turned off, the default sort order for task names and families in the dot and text views will be the order in which they appear in the suite definition. Clicking on the task name column in the treeview toggles to alphanumeric sort, and a View menu item does the same for the dot view. If turned off, the default sort order is alphanumeric and definition order is not available at all.
If “text” is in initial views then sort column sets the column that will be sorted initially when the GUI launches. Sorting can be changed later by clicking on the column headers.
For use in combination with sort column, sets whether the column will be sorted using ascending or descending order.
The color used to highlight active task filters in gcylc. It must be a name from the X11 rgb.txt file, e.g. SteelBlue; or a quoted hexadecimal color code, e.g. "#ff0000" for red (quotes are required to prevent the hex code being interpreted as a comment).
Set the initial filtering options when the GUI starts. Later this can be changed by using the “View” menu “Task Filtering” option.
Transposes the content in dot view so that it displays from left to right rather than from top to bottom. Can be changed later using the options submenu available via the view menu.
Transposes the content in graph view so that it displays from left to right rather than from top to bottom. Can be changed later using the options submenu via the view menu.
List suite views, if any, that should be displayed initially in an ungrouped state. Namespace family grouping can be changed later using the tool bar.
Set the task state color theme, common to all views, to use initially. The color theme can be changed later using the tool bar. See etc/gcylc.rc.eg and etc/gcylc-themes.rc in the Cylc installation directory for how to modify existing themes or define your own. Use cylc get-gui-config to list available themes.
Sets the size (in pixels) of the cylc GUI at startup.
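Pulling a few of these settings together, a $HOME/.cylc/gcylc.rc might look like this (a sketch; the values are illustrative):

    dot icon size = small
    initial views = graph, dot
    use theme = default
    window size = 800, 600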
This section may contain task state color theme definitions.
The name of the task state color-theme to be defined in this section.
[themes] →[[THEME]] →inherit You can inherit from another theme in order to avoid defining all states.
[themes] →[[THEME]] →defaults Set default icon attributes for all state icons in this theme.
For the attribute values, COLOR and FONTCOLOR can be color names from the X11 rgb.txt file, e.g. SteelBlue; or hexadecimal color codes, e.g. #ff0000 for red; and STYLE can be “filled” or “unfilled”. See etc/gcylc.rc.eg and etc/gcylc-themes.rc in the Cylc installation directory for examples.
[themes] →[[THEME]] →STATE Set icon attributes for all task states in THEME, or for a subset of them if you have used theme inheritance and/or defaults. Legal values of STATE are any of the cylc task proxy states: waiting, runahead, held, queued, ready, submitted, submit-failed, running, succeeded, failed, retrying, submit-retrying.
Attribute values take the same form as described above for defaults; see etc/gcylc.rc.eg and etc/gcylc-themes.rc in the Cylc installation directory for examples.
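As an illustration, a user theme that inherits the default theme and merely recolours the failed state might look like this (a sketch; see the files named above for the exact attribute syntax):

    [themes]
        [[MyTheme]]
            inherit = default
            failed = "color=#ff0000", "style=filled"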
This section defines all legal items and values for the gscan config file, which should be located in $HOME/.cylc/gscan.rc. Some items also affect the gpanel panel app.
The main menubar can be hidden to maximise the display area. Its visibility can be toggled via the mouse right-click menu, or by typing Alt-m. When visible, the main View menu allows you to change properties such as the columns that are displayed, which hosts to scan for running suites, and the task state icon theme.
At startup, the task state icon theme and icon size are taken from the gcylc config file $HOME/.cylc/gcylc.rc.
Set whether or not cylc gpanel activates automatically when the GUI panel application is loaded.
Set the columns to display when the cylc gscan GUI starts. This can be changed later with the View menu. The order in which the columns are specified here does not affect the display order.
Set the time interval between refreshing the suite listing (by file system or port range scan).
Increasing this setting will reduce how often gscan looks for running suites. Scanning for suites by port range scan can load the network and the running suite processes, while scanning by walking the file system can load the file system (especially a network file system), so this interval is normally set longer than the status update interval. Increasing it makes gscan friendlier to the network and/or the file system, but gscan may appear out of sync if many suites start up or shut down between scans.
Set the time interval between calls to known running suites (suites that are known via the latest suite listing) for data updates.
Increasing this setting will reduce the network traffic and hits on the suite processes. However, gscan may appear out of sync with what may be happening in very busy suites.
Sets the size in pixels of the cylc gscan GUI window at startup.
Hide the main menubar of the cylc gscan GUI window at startup. By default, the menubar is initially hidden. Either way, you can toggle its visibility with Alt-m or via the right-click menu.
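Combining several of these settings, a $HOME/.cylc/gscan.rc might look like this (a sketch; the values are illustrative):

    activate on startup = False
    suite listing update interval = PT1M
    suite status update interval = PT15S
    window size = 300, 200
    hide main menubar = True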
Managing tasks in a workflow requires more than just job execution: Cylc performs additional actions with rsync for file transfer, and direct execution of cylc sub-commands over non-interactive SSH.5
Some sites may want to restrict access to job hosts by whitelisting SSH connections to allow only rsync for file transfer, and allowing job execution only via a local batch system that sees the job hosts.6 We are investigating the feasibility of SSH-free job management when a local batch system is available, but this is not yet possible unless your suite and job hosts also share a filesystem, which allows Cylc to treat jobs as entirely local.7
Cylc does not have persistent agent processes running on job hosts to act on instructions received over the network,8 so instead we execute job management commands directly on job hosts over SSH.
The following suite, registered as suitex, is used to illustrate our current SSH-based remote job management. It submits two jobs to a remote host, and a local task views a remote job log, then polls and kills the remote jobs.
The delayer task just separates suite start-up from remote job submission, for clarity when watching the job host (e.g. with watch -n 1 find ~/cylc-run/suitex).
Global config specifies the path to the remote Cylc executable, says to retrieve job logs, and says not to use a remote login shell; a sketch follows.
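A global.rc sketch to that effect (the host name and executable path are placeholders):

    [hosts]
        [[remotehost]]
            cylc executable = /opt/cylc/bin/cylc
            retrieve job logs = True
            use login shell = False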
On running the suite, remote job host actions were captured in the transcripts below by wrapping the ssh, scp, and rsync executables in scripts that log their command lines before taking action.
NOTE: the rest of this section is omitted in the HTML User Guide because the complex formatting translates badly to HTML. Please see the PDF User Guide for details.
The graph view in the gcylc GUI shows the structure of the suite as it evolves. It can work well even for large suites, but be aware that the graphviz layout engine has to do a new global layout every time a task proxy appears in or disappears from the task pool, which can make the displayed layout jump around.
Cylc suite server programs and clients (commands, cylc gui, task messaging) communicate via particular ports using the HTTPS protocol, secured by HTTP Digest Authentication using the suite’s 20-random-character private passphrase and private SSL certificate.
This is enabled via the cherrypy library bundled with Cylc (for the server) and, for the clients, either the Python requests library (if available) or the built-in Python HTTP libraries.
All suites are entirely isolated from one another.
Cylc 6 introduced new date-time-related syntax for the suite.rc file. In some places, this is quite radically different from the earlier syntax.
Timeouts and delays such as [cylc][[events]]timeout or [runtime][[my_task]][[[job]]]execution retry delays were written in a purely numeric form before cylc 6, in seconds, minutes (most common), or hours, depending on the setting.
They are now written in an ISO 8601 duration form, which has the benefit that the units are user-selectable (use 1 day instead of 1440 minutes) and explicit.
Nearly all timeouts and delays in cylc were in minutes, except for:
[runtime][[my_task]][[[suite state polling]]]interval
[runtime][[my_task]][[[simulation mode]]]run time range
which were in seconds, and
[scheduling]runahead limit
which was in hours (this is a special case discussed below in L.2).
See Table 1.
| Setting | Pre-Cylc-6 | Cylc-6+ |
| [cylc][[events]]timeout | 180 | PT3H |
| [runtime][[my_task]][[[job]]]execution retry delays | 2*30, 360, 1440 | 2*PT30M, PT6H, P1D |
| [runtime][[my_task]][[[suite state polling]]]interval | 2 | PT2S |
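Rendered in a suite.rc, the new ISO 8601 forms from Table 1 look like this (a sketch):

    [cylc]
        [[events]]
            timeout = PT3H
    [runtime]
        [[my_task]]
            [[[job]]]
                execution retry delays = 2*PT30M, PT6H, P1D
            [[[suite state polling]]]
                interval = PT2S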
See A.4.7.
The [scheduling]runahead limit setting was written as a number of hours in pre-cylc-6 suites. This is now in ISO 8601 format for date-time cycling suites, so [scheduling]runahead limit=36 would be written [scheduling]runahead limit=PT36H.
There is a new preferred alternative to runahead limit, [scheduling]max active cycle points. This allows the user to configure how many cycle points can run at once (default 3). See A.4.8.
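For example, to allow up to five cycle points to be active at once (a sketch):

    [scheduling]
        max active cycle points = 5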
See A.4.2.
The following suite.rc settings have changed name (Table 2):
| Pre-Cylc-6 | Cylc-6+ |
| [scheduling]initial cycle time | [scheduling]initial cycle point |
| [scheduling]final cycle time | [scheduling]final cycle point |
| [visualization]initial cycle time | [visualization]initial cycle point |
| [visualization]final cycle time | [visualization]final cycle point |
This change is to reflect the fact that cycling in cylc 6+ can now be over e.g. integers instead of being purely based on date-time.
Date-times written in initial cycle time and final cycle time were in a cylc-specific 10-digit (or less) CCYYMMDDhh format, such as 2014021400 for 00:00 on the 14th of February 2014.
Date-times are now required to be ISO 8601 compatible. This can be achieved easily enough by inserting a T between the day and the hour digits.
| Setting | Pre-Cylc-6 | Cylc-6+ |
| [scheduling]initial cycle time | 2014021400 | 20140214T00 |
Special start-up and cold-start tasks have been removed from cylc 6. Instead, use the initial/run-once notation as detailed in 7.23.3 and 9.3.4.7.
Repeating asynchronous tasks have also been removed because non date-time workflows can now be handled more easily with integer cycling. See for instance the satellite data processing example documented in 9.3.4.8.
For repeating tasks with hour-based cycling the syntax has only minor changes, as the before-and-after sketch below shows.
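A minimal before-and-after sketch (the graph line and task names are illustrative):

Pre-cylc-6:

    [scheduling]
        [[dependencies]]
            [[[0,6,12,18]]]
                graph = "foo => bar"

Cylc-6+:

    [scheduling]
        [[dependencies]]
            [[[T00,T06,T12,T18]]]
                graph = "foo => bar"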
Hour-based cycling section names are easy enough to convert, as seen in Table 4.
| Pre-Cylc-6 | Cylc-6+ |
| [scheduling][[dependencies]][[[0]]] | [scheduling][[dependencies]][[[T00]]] |
| [scheduling][[dependencies]][[[6]]] | [scheduling][[dependencies]][[[T06]]] |
| [scheduling][[dependencies]][[[12]]] | [scheduling][[dependencies]][[[T12]]] |
| [scheduling][[dependencies]][[[18]]] | [scheduling][[dependencies]][[[T18]]] |
The graph text in hour-based cycling is also easy to convert, as seen in Table 5.
| Pre-Cylc-6 | Cylc-6+ |
| my_task[T-6] | my_task[-PT6H] |
| my_task[T-12] | my_task[-PT12H] |
| my_task[T-24] | my_task[-PT24H] or even my_task[-P1D] |
Prior to cylc-6, intercycle offset triggers implicitly created task instances at the offset cycle points. For example, a pre-cylc-6 suite could automatically create instances of task foo at the offset hours 3, 9, 15, 21 each day, for task bar to trigger off at 0, 6, 12, 18; a sketch of such a suite follows.
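A sketch consistent with that description (the original listing may differ in detail):

    [scheduling]
        [[dependencies]]
            [[[0,6,12,18]]]
                graph = "foo[T-3] => bar"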
Here is the direct translation to cylc-6+ format, sketched below.
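A sketch of the direct translation (assuming the suite above):

    [scheduling]
        [[dependencies]]
            [[[T00,T06,T12,T18]]]
                graph = "foo[-PT3H] => bar"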
This suite fails validation with ERROR: No cycling sequences defined for foo, and were it run anyway it would stall, with bar instances waiting on non-existent offset foo instances (these appear as ghost nodes in graph visualisations).
To fix this, explicitly define the cycling of foo on its own offset cycling sequence, as sketched below.
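A sketch of the fixed suite, with foo given its own offset sequence:

    [scheduling]
        [[dependencies]]
            [[[T03,T09,T15,T21]]]
                graph = "foo"
            [[[T00,T06,T12,T18]]]
                graph = "foo[-PT3H] => bar"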
Implicit task creation by offset triggers is no longer allowed because it is error prone: a mistaken task cycle point offset should cause a failure rather than automatically creating task instances on the wrong cycling sequence.
The best place to find current known issues is on Github: https://github.com/cylc/cylc/issues.
In bash, the return status of a pipeline is normally the exit status of its last command. This is unsafe, because if any other command in the pipeline fails, the script will nevertheless continue.
For safety, a cylc task job script running in bash has the set -o pipefail option turned on automatically. If a pipeline exists in a task’s script (or related) sections, the failure of any part of the pipeline will cause the command to return a non-zero code at the end, which will be reported as a task job failure. Because of how pipelines work, the job file traps the failure of the individual commands as well as of the whole pipeline, and so attempts to report a failure back to the suite twice; the second message is ignored by the suite, so this double report is harmless. (You should still investigate the failure itself, however!)
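The difference is easy to see at a bash prompt (not cylc-specific):

    $ false | true; echo $?    # status of the last command only
    0
    $ set -o pipefail
    $ false | true; echo $?    # any failing stage fails the pipeline
    1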
Copyright © 2007 Free Software Foundation, Inc. http://fsf.org/
Everyone is permitted to copy and distribute verbatim copies of this license document, but changing it is not allowed.
Preamble
The GNU General Public License is a free, copyleft license for software and other kinds of works.
The licenses for most software and other practical works are designed to take away your freedom to share and change the works. By contrast, the GNU General Public License is intended to guarantee your freedom to share and change all versions of a program–to make sure it remains free software for all its users. We, the Free Software Foundation, use the GNU General Public License for most of our software; it applies also to any other work released this way by its authors. You can apply it to your programs, too.
When we speak of free software, we are referring to freedom, not price. Our General Public Licenses are designed to make sure that you have the freedom to distribute copies of free software (and charge for them if you wish), that you receive source code or can get it if you want it, that you can change the software or use pieces of it in new free programs, and that you know you can do these things.
To protect your rights, we need to prevent others from denying you these rights or asking you to surrender the rights. Therefore, you have certain responsibilities if you distribute copies of the software, or if you modify it: responsibilities to respect the freedom of others.
For example, if you distribute copies of such a program, whether gratis or for a fee, you must pass on to the recipients the same freedoms that you received. You must make sure that they, too, receive or can get the source code. And you must show them these terms so they know their rights.
Developers that use the GNU GPL protect your rights with two steps: (1) assert copyright on the software, and (2) offer you this License giving you legal permission to copy, distribute and/or modify it.
For the developers’ and authors’ protection, the GPL clearly explains that there is no warranty for this free software. For both users’ and authors’ sake, the GPL requires that modified versions be marked as changed, so that their problems will not be attributed erroneously to authors of previous versions.
Some devices are designed to deny users access to install or run modified versions of the software inside them, although the manufacturer can do so. This is fundamentally incompatible with the aim of protecting users’ freedom to change the software. The systematic pattern of such abuse occurs in the area of products for individuals to use, which is precisely where it is most unacceptable. Therefore, we have designed this version of the GPL to prohibit the practice for those products. If such problems arise substantially in other domains, we stand ready to extend this provision to those domains in future versions of the GPL, as needed to protect the freedom of users.
Finally, every program is threatened constantly by software patents. States should not allow patents to restrict development and use of software on general-purpose computers, but in those that do, we wish to avoid the special danger that patents applied to a free program could make it effectively proprietary. To prevent this, the GPL assures that patents cannot be used to render the program non-free.
The precise terms and conditions for copying, distribution and modification follow.
Terms and Conditions
“This License” refers to version 3 of the GNU General Public License.
“Copyright” also means copyright-like laws that apply to other kinds of works, such as semiconductor masks.
“The Program” refers to any copyrightable work licensed under this License. Each licensee is addressed as “you”. “Licensees” and “recipients” may be individuals or organizations.
To “modify” a work means to copy from or adapt all or part of the work in a fashion requiring copyright permission, other than the making of an exact copy. The resulting work is called a “modified version” of the earlier work or a work “based on” the earlier work.
A “covered work” means either the unmodified Program or a work based on the Program.
To “propagate” a work means to do anything with it that, without permission, would make you directly or secondarily liable for infringement under applicable copyright law, except executing it on a computer or modifying a private copy. Propagation includes copying, distribution (with or without modification), making available to the public, and in some countries other activities as well.
To “convey” a work means any kind of propagation that enables other parties to make or receive copies. Mere interaction with a user through a computer network, with no transfer of a copy, is not conveying.
An interactive user interface displays “Appropriate Legal Notices” to the extent that it includes a convenient and prominently visible feature that (1) displays an appropriate copyright notice, and (2) tells the user that there is no warranty for the work (except to the extent that warranties are provided), that licensees may convey the work under this License, and how to view a copy of this License. If the interface presents a list of user commands or options, such as a menu, a prominent item in the list meets this criterion.
The “source code” for a work means the preferred form of the work for making modifications to it. “Object code” means any non-source form of a work.
A “Standard Interface” means an interface that either is an official standard defined by a recognized standards body, or, in the case of interfaces specified for a particular programming language, one that is widely used among developers working in that language.
The “System Libraries” of an executable work include anything, other than the work as a whole, that (a) is included in the normal form of packaging a Major Component, but which is not part of that Major Component, and (b) serves only to enable use of the work with that Major Component, or to implement a Standard Interface for which an implementation is available to the public in source code form. A “Major Component”, in this context, means a major essential component (kernel, window system, and so on) of the specific operating system (if any) on which the executable work runs, or a compiler used to produce the work, or an object code interpreter used to run it.
The “Corresponding Source” for a work in object code form means all the source code needed to generate, install, and (for an executable work) run the object code and to modify the work, including scripts to control those activities. However, it does not include the work’s System Libraries, or general-purpose tools or generally available free programs which are used unmodified in performing those activities but which are not part of the work. For example, Corresponding Source includes interface definition files associated with source files for the work, and the source code for shared libraries and dynamically linked subprograms that the work is specifically designed to require, such as by intimate data communication or control flow between those subprograms and other parts of the work.
The Corresponding Source need not include anything that users can regenerate automatically from other parts of the Corresponding Source.
The Corresponding Source for a work in source code form is that same work.
All rights granted under this License are granted for the term of copyright on the Program, and are irrevocable provided the stated conditions are met. This License explicitly affirms your unlimited permission to run the unmodified Program. The output from running a covered work is covered by this License only if the output, given its content, constitutes a covered work. This License acknowledges your rights of fair use or other equivalent, as provided by copyright law.
You may make, run and propagate covered works that you do not convey, without conditions so long as your license otherwise remains in force. You may convey covered works to others for the sole purpose of having them make modifications exclusively for you, or provide you with facilities for running those works, provided that you comply with the terms of this License in conveying all material for which you do not control copyright. Those thus making or running the covered works for you must do so exclusively on your behalf, under your direction and control, on terms that prohibit them from making any copies of your copyrighted material outside their relationship with you.
Conveying under any other circumstances is permitted solely under the conditions stated below. Sublicensing is not allowed; section 10 makes it unnecessary.
No covered work shall be deemed part of an effective technological measure under any applicable law fulfilling obligations under article 11 of the WIPO copyright treaty adopted on 20 December 1996, or similar laws prohibiting or restricting circumvention of such measures.
When you convey a covered work, you waive any legal power to forbid circumvention of technological measures to the extent such circumvention is effected by exercising rights under this License with respect to the covered work, and you disclaim any intention to limit operation or modification of the work as a means of enforcing, against the work’s users, your or third parties’ legal rights to forbid circumvention of technological measures.
You may convey verbatim copies of the Program’s source code as you receive it, in any medium, provided that you conspicuously and appropriately publish on each copy an appropriate copyright notice; keep intact all notices stating that this License and any non-permissive terms added in accord with section 7 apply to the code; keep intact all notices of the absence of any warranty; and give all recipients a copy of this License along with the Program.
You may charge any price or no price for each copy that you convey, and you may offer support or warranty protection for a fee.
You may convey a work based on the Program, or the modifications to produce it from the Program, in the form of source code under the terms of section 4, provided that you also meet all of these conditions:
A compilation of a covered work with other separate and independent works, which are not by their nature extensions of the covered work, and which are not combined with it such as to form a larger program, in or on a volume of a storage or distribution medium, is called an “aggregate” if the compilation and its resulting copyright are not used to limit the access or legal rights of the compilation’s users beyond what the individual works permit. Inclusion of a covered work in an aggregate does not cause this License to apply to the other parts of the aggregate.
You may convey a covered work in object code form under the terms of sections 4 and 5, provided that you also convey the machine-readable Corresponding Source under the terms of this License, in one of these ways:
A separable portion of the object code, whose source code is excluded from the Corresponding Source as a System Library, need not be included in conveying the object code work.
A “User Product” is either (1) a “consumer product”, which means any tangible personal property which is normally used for personal, family, or household purposes, or (2) anything designed or sold for incorporation into a dwelling. In determining whether a product is a consumer product, doubtful cases shall be resolved in favor of coverage. For a particular product received by a particular user, “normally used” refers to a typical or common use of that class of product, regardless of the status of the particular user or of the way in which the particular user actually uses, or expects or is expected to use, the product. A product is a consumer product regardless of whether the product has substantial commercial, industrial or non-consumer uses, unless such uses represent the only significant mode of use of the product.
“Installation Information” for a User Product means any methods, procedures, authorization keys, or other information required to install and execute modified versions of a covered work in that User Product from a modified version of its Corresponding Source. The information must suffice to ensure that the continued functioning of the modified object code is in no case prevented or interfered with solely because modification has been made.
If you convey an object code work under this section in, or with, or specifically for use in, a User Product, and the conveying occurs as part of a transaction in which the right of possession and use of the User Product is transferred to the recipient in perpetuity or for a fixed term (regardless of how the transaction is characterized), the Corresponding Source conveyed under this section must be accompanied by the Installation Information. But this requirement does not apply if neither you nor any third party retains the ability to install modified object code on the User Product (for example, the work has been installed in ROM).
The requirement to provide Installation Information does not include a requirement to continue to provide support service, warranty, or updates for a work that has been modified or installed by the recipient, or for the User Product in which it has been modified or installed. Access to a network may be denied when the modification itself materially and adversely affects the operation of the network or violates the rules and protocols for communication across the network.
Corresponding Source conveyed, and Installation Information provided, in accord with this section must be in a format that is publicly documented (and with an implementation available to the public in source code form), and must require no special password or key for unpacking, reading or copying.
“Additional permissions” are terms that supplement the terms of this License by making exceptions from one or more of its conditions. Additional permissions that are applicable to the entire Program shall be treated as though they were included in this License, to the extent that they are valid under applicable law. If additional permissions apply only to part of the Program, that part may be used separately under those permissions, but the entire Program remains governed by this License without regard to the additional permissions.
When you convey a copy of a covered work, you may at your option remove any additional permissions from that copy, or from any part of it. (Additional permissions may be written to require their own removal in certain cases when you modify the work.) You may place additional permissions on material, added by you to a covered work, for which you have or can give appropriate copyright permission.
Notwithstanding any other provision of this License, for material you add to a covered work, you may (if authorized by the copyright holders of that material) supplement the terms of this License with terms:
All other non-permissive additional terms are considered “further restrictions” within the meaning of section 10. If the Program as you received it, or any part of it, contains a notice stating that it is governed by this License along with a term that is a further restriction, you may remove that term. If a license document contains a further restriction but permits relicensing or conveying under this License, you may add to a covered work material governed by the terms of that license document, provided that the further restriction does not survive such relicensing or conveying.
If you add terms to a covered work in accord with this section, you must place, in the relevant source files, a statement of the additional terms that apply to those files, or a notice indicating where to find the applicable terms.
Additional terms, permissive or non-permissive, may be stated in the form of a separately written license, or stated as exceptions; the above requirements apply either way.
You may not propagate or modify a covered work except as expressly provided under this License. Any attempt otherwise to propagate or modify it is void, and will automatically terminate your rights under this License (including any patent licenses granted under the third paragraph of section 11).
However, if you cease all violation of this License, then your license from a particular copyright holder is reinstated (a) provisionally, unless and until the copyright holder explicitly and finally terminates your license, and (b) permanently, if the copyright holder fails to notify you of the violation by some reasonable means prior to 60 days after the cessation.
Moreover, your license from a particular copyright holder is reinstated permanently if the copyright holder notifies you of the violation by some reasonable means, this is the first time you have received notice of violation of this License (for any work) from that copyright holder, and you cure the violation prior to 30 days after your receipt of the notice.
Termination of your rights under this section does not terminate the licenses of parties who have received copies or rights from you under this License. If your rights have been terminated and not permanently reinstated, you do not qualify to receive new licenses for the same material under section 10.
You are not required to accept this License in order to receive or run a copy of the Program. Ancillary propagation of a covered work occurring solely as a consequence of using peer-to-peer transmission to receive a copy likewise does not require acceptance. However, nothing other than this License grants you permission to propagate or modify any covered work. These actions infringe copyright if you do not accept this License. Therefore, by modifying or propagating a covered work, you indicate your acceptance of this License to do so.
Each time you convey a covered work, the recipient automatically receives a license from the original licensors, to run, modify and propagate that work, subject to this License. You are not responsible for enforcing compliance by third parties with this License.
An “entity transaction” is a transaction transferring control of an organization, or substantially all assets of one, or subdividing an organization, or merging organizations. If propagation of a covered work results from an entity transaction, each party to that transaction who receives a copy of the work also receives whatever licenses to the work the party’s predecessor in interest had or could give under the previous paragraph, plus a right to possession of the Corresponding Source of the work from the predecessor in interest, if the predecessor has it or can get it with reasonable efforts.
You may not impose any further restrictions on the exercise of the rights granted or affirmed under this License. For example, you may not impose a license fee, royalty, or other charge for exercise of rights granted under this License, and you may not initiate litigation (including a cross-claim or counterclaim in a lawsuit) alleging that any patent claim is infringed by making, using, selling, offering for sale, or importing the Program or any portion of it.
A “contributor” is a copyright holder who authorizes use under this License of the Program or a work on which the Program is based. The work thus licensed is called the contributor’s “contributor version”.
A contributor’s “essential patent claims” are all patent claims owned or controlled by the contributor, whether already acquired or hereafter acquired, that would be infringed by some manner, permitted by this License, of making, using, or selling its contributor version, but do not include claims that would be infringed only as a consequence of further modification of the contributor version. For purposes of this definition, “control” includes the right to grant patent sublicenses in a manner consistent with the requirements of this License.
Each contributor grants you a non-exclusive, worldwide, royalty-free patent license under the contributor’s essential patent claims, to make, use, sell, offer for sale, import and otherwise run, modify and propagate the contents of its contributor version.
In the following three paragraphs, a “patent license” is any express agreement or commitment, however denominated, not to enforce a patent (such as an express permission to practice a patent or covenant not to sue for patent infringement). To “grant” such a patent license to a party means to make such an agreement or commitment not to enforce a patent against the party.
If you convey a covered work, knowingly relying on a patent license, and the Corresponding Source of the work is not available for anyone to copy, free of charge and under the terms of this License, through a publicly available network server or other readily accessible means, then you must either (1) cause the Corresponding Source to be so available, or (2) arrange to deprive yourself of the benefit of the patent license for this particular work, or (3) arrange, in a manner consistent with the requirements of this License, to extend the patent license to downstream recipients. “Knowingly relying” means you have actual knowledge that, but for the patent license, your conveying the covered work in a country, or your recipient’s use of the covered work in a country, would infringe one or more identifiable patents in that country that you have reason to believe are valid.
If, pursuant to or in connection with a single transaction or arrangement, you convey, or propagate by procuring conveyance of, a covered work, and grant a patent license to some of the parties receiving the covered work authorizing them to use, propagate, modify or convey a specific copy of the covered work, then the patent license you grant is automatically extended to all recipients of the covered work and works based on it.
A patent license is “discriminatory” if it does not include within the scope of its coverage, prohibits the exercise of, or is conditioned on the non-exercise of one or more of the rights that are specifically granted under this License. You may not convey a covered work if you are a party to an arrangement with a third party that is in the business of distributing software, under which you make payment to the third party based on the extent of your activity of conveying the work, and under which the third party grants, to any of the parties who would receive the covered work from you, a discriminatory patent license (a) in connection with copies of the covered work conveyed by you (or copies made from those copies), or (b) primarily for and in connection with specific products or compilations that contain the covered work, unless you entered into that arrangement, or that patent license was granted, prior to 28 March 2007.
Nothing in this License shall be construed as excluding or limiting any implied license or other defenses to infringement that may otherwise be available to you under applicable patent law.
If conditions are imposed on you (whether by court order, agreement or otherwise) that contradict the conditions of this License, they do not excuse you from the conditions of this License. If you cannot convey a covered work so as to satisfy simultaneously your obligations under this License and any other pertinent obligations, then as a consequence you may not convey it at all. For example, if you agree to terms that obligate you to collect a royalty for further conveying from those to whom you convey the Program, the only way you could satisfy both those terms and this License would be to refrain entirely from conveying the Program.
Notwithstanding any other provision of this License, you have permission to link or combine any covered work with a work licensed under version 3 of the GNU Affero General Public License into a single combined work, and to convey the resulting work. The terms of this License will continue to apply to the part which is the covered work, but the special requirements of the GNU Affero General Public License, section 13, concerning interaction through a network will apply to the combination as such.
The Free Software Foundation may publish revised and/or new versions of the GNU General Public License from time to time. Such new versions will be similar in spirit to the present version, but may differ in detail to address new problems or concerns.
Each version is given a distinguishing version number. If the Program specifies that a certain numbered version of the GNU General Public License “or any later version” applies to it, you have the option of following the terms and conditions either of that numbered version or of any later version published by the Free Software Foundation. If the Program does not specify a version number of the GNU General Public License, you may choose any version ever published by the Free Software Foundation.
If the Program specifies that a proxy can decide which future versions of the GNU General Public License can be used, that proxy’s public statement of acceptance of a version permanently authorizes you to choose that version for the Program.
Later license versions may give you additional or different permissions. However, no additional obligations are imposed on any author or copyright holder as a result of your choosing to follow a later version.
THERE IS NO WARRANTY FOR THE PROGRAM, TO THE EXTENT PERMITTED BY APPLICABLE LAW. EXCEPT WHEN OTHERWISE STATED IN WRITING THE COPYRIGHT HOLDERS AND/OR OTHER PARTIES PROVIDE THE PROGRAM “AS IS” WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESSED OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE. THE ENTIRE RISK AS TO THE QUALITY AND PERFORMANCE OF THE PROGRAM IS WITH YOU. SHOULD THE PROGRAM PROVE DEFECTIVE, YOU ASSUME THE COST OF ALL NECESSARY SERVICING, REPAIR OR CORRECTION.
IN NO EVENT UNLESS REQUIRED BY APPLICABLE LAW OR AGREED TO IN WRITING WILL ANY COPYRIGHT HOLDER, OR ANY OTHER PARTY WHO MODIFIES AND/OR CONVEYS THE PROGRAM AS PERMITTED ABOVE, BE LIABLE TO YOU FOR DAMAGES, INCLUDING ANY GENERAL, SPECIAL, INCIDENTAL OR CONSEQUENTIAL DAMAGES ARISING OUT OF THE USE OR INABILITY TO USE THE PROGRAM (INCLUDING BUT NOT LIMITED TO LOSS OF DATA OR DATA BEING RENDERED INACCURATE OR LOSSES SUSTAINED BY YOU OR THIRD PARTIES OR A FAILURE OF THE PROGRAM TO OPERATE WITH ANY OTHER PROGRAMS), EVEN IF SUCH HOLDER OR OTHER PARTY HAS BEEN ADVISED OF THE POSSIBILITY OF SUCH DAMAGES.
If the disclaimer of warranty and limitation of liability provided above cannot be given local legal effect according to their terms, reviewing courts shall apply local law that most closely approximates an absolute waiver of all civil liability in connection with the Program, unless a warranty or assumption of liability accompanies a copy of the Program in return for a fee.
End of Terms and Conditions
How to Apply These Terms to Your New Programs
If you develop a new program, and you want it to be of the greatest possible use to the public, the best way to achieve this is to make it free software which everyone can redistribute and change under these terms.
To do so, attach the following notices to the program. It is safest to attach them to the start of each source file to most effectively state the exclusion of warranty; and each file should have at least the “copyright” line and a pointer to where the full notice is found.
Also add information on how to contact you by electronic and paper mail.
If the program does terminal interaction, make it output a short notice like this when it starts in an interactive mode:
The hypothetical commands show w and show c should show the appropriate parts of the General Public License. Of course, your program’s commands might be different; for a GUI interface, you would use an “about box”.
You should also get your employer (if you work as a programmer) or school, if any, to sign a “copyright disclaimer” for the program, if necessary. For more information on this, and how to apply and follow the GNU GPL, see http://www.gnu.org/licenses/.
The GNU General Public License does not permit incorporating your program into proprietary programs. If your program is a subroutine library, you may consider it more useful to permit linking proprietary applications with the library. If this is what you want to do, use the GNU Lesser General Public License instead of this License. But first, please read http://www.gnu.org/philosophy/why-not-lgpl.html.
1Future plans for EcoConnect include additional deterministic regional weather forecasts and a statistical ensemble.
2An OR operator on the right doesn’t make much sense: if “B or C” triggers off A, what exactly should cylc do when A finishes?
3In NWP forecast analysis suites parts of the observation processing and data assimilation subsystem will typically also depend on model background fields generated by the previous forecast.
4Late notification of clock-triggered tasks is not very useful in any case: they typically do not depend on other tasks, so they can often trigger on time even if the suite is delayed to the point that downstream tasks are late because of their dependence on delayed previous-cycle tasks.
5Cylc used to run bare shell expressions over SSH, which required a bash shell and made whitelisting difficult.
6A malicious script could be rsync’d and run from a batch job, but batch jobs are considered easier to audit.
7The job ID must also be valid to query and kill the job via the local batch system. This is not the case for Slurm: unless the --cluster option is explicitly used in job query and kill commands, the job ID is not recognized by the local Slurm instance.
8This would be a more complex solution, in terms of implementation, administration, and security.